From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Carlos Maiolino <cem@kernel.org>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/4] xfs: fix incorrect use of gfp flags in xfs_buf_alloc_backing_mem
Date: Mon, 29 Jun 2026 18:49:35 -0700 [thread overview]
Message-ID: <20260630014935.GI6078@frogsfrogsfrogs> (raw)
In-Reply-To: <20260617055814.3842058-4-hch@lst.de>
On Wed, Jun 17, 2026 at 07:58:04AM +0200, Christoph Hellwig wrote:
> xfs_buf_alloc_backing_mem currently has two issues with how the GFP_
> flags are set:
>
> - when aiming for a large folio allocation, the gfp mask is adjusted
> to try less hard, but these flags then persist for the vmalloc
> allocation, which is bogus.
> - the __GFP_NOFAIL for small allocations is also applied when readahead
> force __GFP_NORETRY which doesn't make any sense.
>
> Fix this by only applying __GFP_NOFAIL when __GFP_NORETRY is not set,
> and by reordering the code so that the large folio gfp adjustments
> are performed locally just for that allocation.
>
> Fixes: 94c78cfa3bd1 ("xfs: convert buffer cache to use high order folios")
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_buf.c | 49 +++++++++++++++++++++++-------------------------
> 1 file changed, 23 insertions(+), 26 deletions(-)
>
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index dce398337ad0..eea2a8757fe1 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -223,22 +223,6 @@ xfs_buf_alloc_backing_mem(
> if (flags & XBF_READ_AHEAD)
> gfp_mask |= __GFP_NORETRY;
>
> - /*
> - * For buffers smaller than PAGE_SIZE use a kmalloc allocation if that
> - * is properly aligned. The slab allocator now guarantees an aligned
> - * allocation for all power of two sizes, which matches most of the
> - * smaller than PAGE_SIZE buffers used by XFS.
> - */
> - if (size < PAGE_SIZE && is_power_of_2(size))
> - return xfs_buf_alloc_kmem(bp, size, gfp_mask | __GFP_NOFAIL);
> -
> - /*
> - * Don't bother with the retry loop for single PAGE allocations: vmalloc
> - * won't do any better.
> - */
> - if (size <= PAGE_SIZE)
> - gfp_mask |= __GFP_NOFAIL;
> -
> /*
> * Optimistically attempt a single high order folio allocation for
> * larger than PAGE_SIZE buffers.
> @@ -251,18 +235,31 @@ xfs_buf_alloc_backing_mem(
> * path for them instead of wasting memory here.
> */
> if (size > PAGE_SIZE) {
> - if (!is_power_of_2(size))
> - return xfs_buf_alloc_vmalloc(bp, size, gfp_mask, flags);
> - gfp_mask &= ~__GFP_DIRECT_RECLAIM;
> - gfp_mask |= __GFP_NORETRY;
> - }
> - if (xfs_buf_alloc_folio(bp, size, gfp_mask) < 0) {
> - if (size <= PAGE_SIZE)
> - return -ENOMEM;
> - trace_xfs_buf_backing_fallback(bp, _RET_IP_);
> + if (is_power_of_2(size)) {
> + gfp_t folio_gfp = gfp_mask;
> +
> + folio_gfp &= ~__GFP_DIRECT_RECLAIM;
> + folio_gfp |= __GFP_NORETRY;
> + if (xfs_buf_alloc_folio(bp, size, folio_gfp) == 0)
> + return 0;
> + trace_xfs_buf_backing_fallback(bp, _RET_IP_);
> + }
> return xfs_buf_alloc_vmalloc(bp, size, gfp_mask, flags);
> }
> - return 0;
> +
> + /*
> + * The slab allocator now guarantees aligned allocations for all power
> + * of two sizes. This covers most smaller XFS buffers, so just use
> + * kmalloc in this case.
> + *
> + * Don't bother with the vmalloc fallback for allocations of page size
> + * or less: vmalloc won't do any better.
> + */
> + if (!(gfp_mask & __GFP_NORETRY))
> + gfp_mask |= __GFP_NOFAIL;
> + if (size < PAGE_SIZE && is_power_of_2(size))
> + return xfs_buf_alloc_kmem(bp, size, gfp_mask);
> + return xfs_buf_alloc_folio(bp, size, gfp_mask);
Hrmm, ok. So first we special-case buffers > PAGE_SIZE -- if they're a
power of two, we try (not very hard) to allocate a single large folio.
If that fails or it's not a power-of-two, then we just do vmalloc, which
allows direct reclaim and retries.
For smaller than PAGE_SIZE buffers that are powers of two, I guess we
use the slab allocator, otherwise a full folio. That part strikes me as
a little strange (efficiency, I guess?), but that's what the code did
before so I guess it's ok.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> }
>
> static int
> --
> 2.53.0
>
>
next prev parent reply other threads:[~2026-06-30 1:49 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-17 5:58 fix GFP_ flag use in the buffer cache Christoph Hellwig
2026-06-17 5:58 ` [PATCH 1/4] xfs: split up xfs_buf_alloc_backing_mem Christoph Hellwig
2026-06-22 12:17 ` Carlos Maiolino
2026-06-30 0:52 ` Darrick J. Wong
2026-06-17 5:58 ` [PATCH 2/4] xfs: lift setting __GFP_NOFAIL from xfs_buf_alloc_kmem to the caller Christoph Hellwig
2026-06-22 12:22 ` Carlos Maiolino
2026-06-30 0:52 ` Darrick J. Wong
2026-07-01 9:31 ` Carlos Maiolino
2026-07-01 16:03 ` Darrick J. Wong
2026-06-17 5:58 ` [PATCH 3/4] xfs: fix incorrect use of gfp flags in xfs_buf_alloc_backing_mem Christoph Hellwig
2026-06-22 12:31 ` Carlos Maiolino
2026-06-30 1:49 ` Darrick J. Wong [this message]
2026-06-30 5:15 ` Christoph Hellwig
2026-06-17 5:58 ` [PATCH 4/4] xfs: simplify the failure path in xfs_buf_alloc_vmalloc Christoph Hellwig
2026-06-22 12:34 ` Carlos Maiolino
2026-06-24 7:41 ` Christoph Hellwig
2026-06-24 8:25 ` Carlos Maiolino
2026-06-30 1:51 ` Darrick J. Wong
2026-07-01 12:50 ` fix GFP_ flag use in the buffer cache Carlos Maiolino
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260630014935.GI6078@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=cem@kernel.org \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox