public inbox for linux-kernel@vger.kernel.org
From: Salvatore Dipietro <dipiets@amazon.it>
To: <linux-kernel@vger.kernel.org>
Cc: <ritesh.list@gmail.com>, <abuehaze@amazon.com>,
	<alisaidi@amazon.com>, <blakgeof@amazon.com>,
	<brauner@kernel.org>, <dipietro.salvatore@gmail.com>,
	<dipiets@amazon.it>, <djwong@kernel.org>,
	<linux-fsdevel@vger.kernel.org>, <linux-mm@kvack.org>,
	<linux-xfs@vger.kernel.org>, <stable@vger.kernel.org>,
	<willy@infradead.org>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH v2] mm/filemap: avoid costly reclaim for high-order folio allocations
Date: Mon, 20 Apr 2026 16:14:03 +0000
Message-ID: <20260420161404.642-1-dipiets@amazon.it>

Commit 5d8edfb900d5 ("iomap: Copy larger chunks from userspace")
introduced high-order folio allocations in the buffered write path.
When memory is fragmented, each failed allocation above
PAGE_ALLOC_COSTLY_ORDER triggers compaction and drain_all_pages() via
__alloc_pages_slowpath(), causing a 0.75x throughput drop on pgbench
(simple-update) with 1024 clients on a 96-vCPU arm64 system.

In __filemap_get_folio_mpol(), for orders above min_order, split the
allocation behavior by cost:

 - For orders above PAGE_ALLOC_COSTLY_ORDER: strip
   __GFP_DIRECT_RECLAIM, making them purely opportunistic. The
   allocator tries the freelists only and returns NULL immediately if
   pages are not available.

 - For non-costly orders (above min_order, up to and including
   PAGE_ALLOC_COSTLY_ORDER): use __GFP_NORETRY to allow lightweight
   direct reclaim without expensive compaction retries.
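
The flag selection described above can be modelled in userspace as a
rough sketch. This is not kernel code: the DEMO_* bit values and the
demo_alloc_gfp() helper are hypothetical placeholders chosen for
illustration, and only the branch structure mirrors the patch.
PAGE_ALLOC_COSTLY_ORDER is 3, as in include/linux/mmzone.h.

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Placeholder bit values (assumption); the kernel's real flag values differ. */
#define DEMO_GFP_DIRECT_RECLAIM 0x1u
#define DEMO_GFP_NORETRY        0x2u
#define DEMO_GFP_NOWARN         0x4u

/* Same value as the kernel's definition in include/linux/mmzone.h. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Hypothetical helper: returns the flags that would be used for one
 * allocation attempt at 'order', given the caller's 'gfp' and the
 * mapping's 'min_order'.
 */
static gfp_t demo_alloc_gfp(gfp_t gfp, unsigned int order,
			    unsigned int min_order)
{
	gfp_t alloc_gfp = gfp;

	/* The allocation loop never tries an order below min_order. */
	assert(order >= min_order);

	if (order > min_order) {
		/* Any order above min_order is optional: fail quietly. */
		alloc_gfp |= DEMO_GFP_NOWARN;
		if (order > PAGE_ALLOC_COSTLY_ORDER)
			/* Costly order: freelists only, no direct reclaim. */
			alloc_gfp &= ~DEMO_GFP_DIRECT_RECLAIM;
		else
			/* Non-costly order: one reclaim pass, no retries. */
			alloc_gfp |= DEMO_GFP_NORETRY;
	}
	return alloc_gfp;
}
```

With min_order 0 and a caller that allows direct reclaim, order 4 loses
the reclaim bit, orders 1 to 3 gain the no-retry bit, and order 0 is
passed through unchanged, so the mandatory min_order allocation keeps
the caller's full reclaim behavior.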

With this patch, pgbench throughput recovers to 148k TPS (+67% vs
regressed baseline), stable across all iterations.

v2:
- Strip __GFP_DIRECT_RECLAIM to avoid costly reclaim for high-order
  folio allocations
- Move the fix from the iomap layer to the mm/filemap layer

Fixes: 5d8edfb900d5 ("iomap: Copy larger chunks from userspace")
Cc: stable@vger.kernel.org
Signed-off-by: Salvatore Dipietro <dipiets@amazon.it>
---
 mm/filemap.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 4e636647100c..f2343c26dd63 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2007,8 +2007,13 @@ struct folio *__filemap_get_folio_mpol(struct address_space *mapping,
 			gfp_t alloc_gfp = gfp;
 
 			err = -ENOMEM;
-			if (order > min_order)
-				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
+			if (order > min_order) {
+				alloc_gfp |= __GFP_NOWARN;
+				if (order > PAGE_ALLOC_COSTLY_ORDER)
+					alloc_gfp &= ~__GFP_DIRECT_RECLAIM;
+				else
+					alloc_gfp |= __GFP_NORETRY;
+			}
 			folio = filemap_alloc_folio(alloc_gfp, order, policy);
 			if (!folio)
 				continue;

base-commit: c7275b05bc428c7373d97aa2da02d3a7fa6b9f66
-- 
2.47.3







