From: Salvatore Dipietro <dipiets@amazon.it>
To: <linux-kernel@vger.kernel.org>
Cc: <ritesh.list@gmail.com>, <abuehaze@amazon.com>,
	<alisaidi@amazon.com>, <blakgeof@amazon.com>,
	<brauner@kernel.org>, <dipietro.salvatore@gmail.com>,
	<dipiets@amazon.it>, <djwong@kernel.org>,
	<linux-fsdevel@vger.kernel.org>, <linux-mm@kvack.org>,
	<linux-xfs@vger.kernel.org>, <stable@vger.kernel.org>,
	<willy@infradead.org>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH v2] mm/filemap: avoid costly reclaim for high-order folio allocations
Date: Mon, 20 Apr 2026 16:14:03 +0000
Message-ID: <20260420161404.642-1-dipiets@amazon.it>

Commit 5d8edfb900d5 ("iomap: Copy larger chunks from userspace")
introduced high-order folio allocations in the buffered write path.
When memory is fragmented, each failed allocation above
PAGE_ALLOC_COSTLY_ORDER triggers compaction and drain_all_pages() via
__alloc_pages_slowpath(), causing a 0.75x throughput drop on pgbench
(simple-update) with 1024 clients on a 96-vCPU arm64 system.

In __filemap_get_folio_mpol(), for orders above min_order, split the
allocation behavior by cost:

 - For orders above PAGE_ALLOC_COSTLY_ORDER: strip
   __GFP_DIRECT_RECLAIM, making them purely opportunistic. The
   allocator tries the freelists only and returns NULL immediately if
   pages are not available.

 - For non-costly orders (above min_order, up to and including
   PAGE_ALLOC_COSTLY_ORDER): use __GFP_NORETRY to allow lightweight
   direct reclaim without expensive compaction retries.
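
The two cases above can be sketched as a small userspace model of the
flag selection. This is only an illustration, not kernel code: the
SKETCH_* bit values and the helper name sketch_adjust_gfp() are
hypothetical stand-ins, and only the branch structure is taken from
the patch. SKETCH_COSTLY_ORDER is set to 3, the value of
PAGE_ALLOC_COSTLY_ORDER in the kernel.

```c
#include <assert.h>

/* Stand-in flag bits for illustration only; the kernel's real gfp_t
 * bit values differ. */
typedef unsigned int sketch_gfp_t;
#define SKETCH_GFP_DIRECT_RECLAIM 0x1u
#define SKETCH_GFP_KSWAPD_RECLAIM 0x2u
#define SKETCH_GFP_NORETRY        0x4u
#define SKETCH_GFP_NOWARN         0x8u
#define SKETCH_COSTLY_ORDER       3u

/* Hypothetical helper mirroring the branch added by the patch. */
static sketch_gfp_t sketch_adjust_gfp(sketch_gfp_t gfp, unsigned int order,
				      unsigned int min_order)
{
	sketch_gfp_t alloc_gfp = gfp;

	/* Assumed precondition: callers never go below min_order. */
	assert(order >= min_order);

	if (order > min_order) {
		/* Any order above the minimum is optional: stay quiet. */
		alloc_gfp |= SKETCH_GFP_NOWARN;
		if (order > SKETCH_COSTLY_ORDER)
			/* Costly order: opportunistic, no direct reclaim. */
			alloc_gfp &= ~SKETCH_GFP_DIRECT_RECLAIM;
		else
			/* Non-costly order: reclaim allowed, no retries. */
			alloc_gfp |= SKETCH_GFP_NORETRY;
	}
	/* order == min_order keeps the caller's flags unchanged. */
	return alloc_gfp;
}
```

In this model, with min_order 0, an order-4 request loses the direct
reclaim bit while an order-3 request keeps it and gains the no-retry
bit. Only the direct reclaim bit is cleared, so a kswapd bit set by
the caller is preserved.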

With this patch, pgbench throughput recovers to 148k TPS (+67% vs
regressed baseline), stable across all iterations.

v2:
- Strip __GFP_DIRECT_RECLAIM to avoid costly reclaim for high-order
  folio allocations
- Move the fix from iomap to the mm/filemap layer

Fixes: 5d8edfb900d5 ("iomap: Copy larger chunks from userspace")
Cc: stable@vger.kernel.org
Signed-off-by: Salvatore Dipietro <dipiets@amazon.it>
---
 mm/filemap.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 4e636647100c..f2343c26dd63 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2007,8 +2007,13 @@ struct folio *__filemap_get_folio_mpol(struct address_space *mapping,
 			gfp_t alloc_gfp = gfp;
 
 			err = -ENOMEM;
-			if (order > min_order)
-				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
+			if (order > min_order) {
+				alloc_gfp |= __GFP_NOWARN;
+				if (order > PAGE_ALLOC_COSTLY_ORDER)
+					alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
+				else
+					alloc_gfp |= __GFP_NORETRY;
+			}
 			folio = filemap_alloc_folio(alloc_gfp, order, policy);
 			if (!folio)
 				continue;

base-commit: c7275b05bc428c7373d97aa2da02d3a7fa6b9f66
-- 
2.47.3



