Date: Thu, 16 Apr 2026 15:51:04 +0100
From: Matt Fleming
To: Pedro Falcato
Cc: Andrew Morton, Christoph Hellwig, Jens Axboe, Sergey Senozhatsky,
	Roman Gushchin, Minchan Kim, kernel-team@cloudflare.com,
	Matt Fleming, Johannes Weiner, Chris Li, Kairui Song, Kemeng Shi,
	Nhat Pham, Baoquan He, Barry Song, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Zi Yan,
	Axel Rasmussen, Yuanchu Xie, Wei Xu, David Hildenbrand, Qi Zheng,
	Shakeel Butt, Lorenzo Stoakes, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: Require LRU reclaim progress before retrying direct reclaim
References: <20260410101550.2930139-1-matt@readmodwrite.com>

On Wed, Apr 15, 2026 at 03:57:25PM +0100, Pedro Falcato wrote:
> On Fri, Apr 10, 2026 at 11:15:49AM +0100, Matt Fleming wrote:
> > From: Matt Fleming
> >
> > should_reclaim_retry() uses zone_reclaimable_pages() to estimate whether
> > retrying reclaim could eventually satisfy an allocation. It's possible
> > for reclaim to make minimal or no progress on an LRU type despite having
> > ample reclaimable pages, e.g. anonymous pages when the only swap is
> > RAM-backed (zram). This can cause the reclaim path to loop indefinitely.
> >
> > Track LRU reclaim progress (anon vs file) through a new struct
> > reclaim_progress passed out of try_to_free_pages(), and only count a
> > type's reclaimable pages if at least reclaim_progress_pct% was actually
> > reclaimed in the last cycle.
>
> I think there is at least one problem with this heuristic: you are counting
> everything that hasn't made progress as "we cannot reclaim it", when in
> reality we can simply fail to make progress on any given folio because,
> e.g., it's referenced and we want to give it another spin in the LRU.

The intention was that the percentage threshold would avoid giving up on
reclaim as long as "sufficient" progress was made. This should allow for
some folios to need another trip through the LRU, but...

> My theory (from merely reading the patch, maybe I missed something) is that
> a pathological case for this is a lot of folios added to the LRU in a row
> that are set referenced (or dirty). Say SWAP_CLUSTER_MAX * MAX_RECLAIM_RETRIES
> - it will simply OOM too early.

OK, yeah, I think I see the problem now: this heuristic applies the
threshold against all reclaimable pages, but that falls apart when reclaim
proceeds in SWAP_CLUSTER_MAX-sized chunks.

> The other question is whether this effectively solves reclaim problems - some
> hard numbers would be great.

I shared some numbers in my reply to Vlastimil, but if there are other
cases you'd like measured I'm happy to run them.