From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 8/9] xfs: use vmalloc for multi-folio buffers
Date: Wed, 20 Mar 2024 11:20:06 +1100 [thread overview]
Message-ID: <ZfortpehJBhja0Cf@dread.disaster.area> (raw)
In-Reply-To: <20240319174819.GU1927156@frogsfrogsfrogs>
On Tue, Mar 19, 2024 at 10:48:19AM -0700, Darrick J. Wong wrote:
> On Tue, Mar 19, 2024 at 09:45:59AM +1100, Dave Chinner wrote:
> > From: Christoph Hellwig <hch@lst.de>
> >
> > Instead of allocating the folios manually using the bulk page
> > allocator and then using vm_map_page just use vmalloc to allocate
> > the entire buffer - vmalloc will use the bulk allocator internally
> > if it fits.
> >
> > With this the b_folios array can go away as well as nothing uses it.
> >
> > [dchinner: port to folio based buffers.]
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> > fs/xfs/xfs_buf.c | 164 ++++++++++++-------------------------------
> > fs/xfs/xfs_buf.h | 2 -
> > fs/xfs/xfs_buf_mem.c | 9 +--
> > 3 files changed, 45 insertions(+), 130 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index 303945554415..6d6bad80722e 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -282,29 +282,6 @@ _xfs_buf_alloc(
> > return 0;
> > }
> >
> > -static void
> > -xfs_buf_free_folios(
> > - struct xfs_buf *bp)
> > -{
> > - uint i;
> > -
> > - ASSERT(bp->b_flags & _XBF_FOLIOS);
> > -
> > - if (xfs_buf_is_vmapped(bp))
> > - vm_unmap_ram(bp->b_addr, bp->b_folio_count);
> > -
> > - for (i = 0; i < bp->b_folio_count; i++) {
> > - if (bp->b_folios[i])
> > - __folio_put(bp->b_folios[i]);
> > - }
> > - mm_account_reclaimed_pages(bp->b_folio_count);
> > -
> > - if (bp->b_folios != bp->b_folio_array)
> > - kfree(bp->b_folios);
> > - bp->b_folios = NULL;
> > - bp->b_flags &= ~_XBF_FOLIOS;
> > -}
> > -
> > static void
> > xfs_buf_free_callback(
> > struct callback_head *cb)
> > @@ -323,13 +300,22 @@ xfs_buf_free(
> >
> > ASSERT(list_empty(&bp->b_lru));
> >
> > - if (xfs_buftarg_is_mem(bp->b_target))
> > + if (xfs_buftarg_is_mem(bp->b_target)) {
> > xmbuf_unmap_folio(bp);
> > - else if (bp->b_flags & _XBF_FOLIOS)
> > - xfs_buf_free_folios(bp);
> > - else if (bp->b_flags & _XBF_KMEM)
> > - kfree(bp->b_addr);
> > + goto free;
> > + }
> >
> > + if (!(bp->b_flags & _XBF_KMEM))
> > + mm_account_reclaimed_pages(bp->b_folio_count);
>
> Echoing hch's statement about the argument being passed to
> mm_account_reclaimed_pages needing to be fed units of base pages, not
> folios.
>
> > +
> > + if (bp->b_flags & _XBF_FOLIOS)
> > + __folio_put(kmem_to_folio(bp->b_addr));
>
> Is it necessary to use folio_put instead of the __ version like hch said
> earlier?
Both fixed.
>
> > + else
> > + kvfree(bp->b_addr);
> > +
> > + bp->b_flags &= _XBF_KMEM | _XBF_FOLIOS;
>
> Shouldn't this be:
>
> bp->b_flags &= ~(_XBF_KMEM | _XBF_FOLIOS); ?
Yes. Good catch.
> > @@ -377,14 +361,15 @@ xfs_buf_alloc_folio(
> > struct xfs_buf *bp,
> > gfp_t gfp_mask)
> > {
> > + struct folio *folio;
> > int length = BBTOB(bp->b_length);
> > int order = get_order(length);
> >
> > - bp->b_folio_array[0] = folio_alloc(gfp_mask, order);
> > - if (!bp->b_folio_array[0])
> > + folio = folio_alloc(gfp_mask, order);
> > + if (!folio)
> > return false;
> >
> > - bp->b_folios = bp->b_folio_array;
> > + bp->b_addr = folio_address(folio);
> > bp->b_folio_count = 1;
> > bp->b_flags |= _XBF_FOLIOS;
> > return true;
> > @@ -400,15 +385,11 @@ xfs_buf_alloc_folio(
> > * contiguous memory region that we don't have to map and unmap to access the
> > * data directly.
> > *
> > - * The second type of buffer is the multi-folio buffer. These are *always* made
> > - * up of single page folios so that they can be fed to vmap_ram() to return a
> > - * contiguous memory region we can access the data through.
> > - *
> > - * We don't use high order folios for this second type of buffer (yet) because
> > - * having variable size folios makes offset-to-folio indexing and iteration of
> > - * the data range more complex than if they are fixed size. This case should now
> > - * be the slow path, though, so unless we regularly fail to allocate high order
> > - * folios, there should be little need to optimise this path.
> > + * The second type of buffer is the vmalloc()d buffer. This provides the buffer
> > + * with the required contiguous memory region but backed by discontiguous
> > + * physical pages. vmalloc() typically doesn't fail, but it can and so we may
> > + * need to wrap the allocation in a loop to prevent low memory failures and
> > + * shutdowns.
>
> Where's the loop now? Is that buried under __vmalloc somewhere?
I thought I'd added __GFP_NOFAIL to the __vmalloc() gfp mask to make
it loop. I suspect I lost it at some point when rebasing either this
or the (now merged) kmem.[ch] removal patchset.
Well spotted, I'll fix that up.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2024-03-22 0:55 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-18 22:45 [PATCH v2 0/9] xfs: use large folios for buffers Dave Chinner
2024-03-18 22:45 ` [PATCH 1/9] xfs: unmapped buffer item size straddling mismatch Dave Chinner
2024-03-18 22:45 ` [PATCH 2/9] xfs: use folios in the buffer cache Dave Chinner
2024-03-19 6:38 ` Christoph Hellwig
2024-03-19 6:52 ` Dave Chinner
2024-03-19 6:53 ` Christoph Hellwig
2024-03-19 21:42 ` Dave Chinner
2024-03-19 21:42 ` Dave Chinner
2024-03-19 17:15 ` Darrick J. Wong
2024-03-18 22:45 ` [PATCH 3/9] xfs: convert buffer cache to use high order folios Dave Chinner
2024-03-19 6:55 ` Christoph Hellwig
2024-03-19 17:29 ` Darrick J. Wong
2024-03-19 21:32 ` Christoph Hellwig
2024-03-19 21:38 ` Darrick J. Wong
2024-03-19 21:41 ` Christoph Hellwig
2024-03-19 22:23 ` Dave Chinner
2024-03-21 2:12 ` Darrick J. Wong
2024-03-21 2:40 ` Darrick J. Wong
2024-03-21 21:28 ` Christoph Hellwig
2024-03-21 21:39 ` Darrick J. Wong
2024-03-21 22:02 ` Christoph Hellwig
2024-03-19 21:55 ` Dave Chinner
2024-03-22 8:02 ` Pankaj Raghav (Samsung)
2024-03-22 22:04 ` Dave Chinner
2024-03-25 11:17 ` Pankaj Raghav (Samsung)
2024-03-18 22:45 ` [PATCH 4/9] xfs: kill XBF_UNMAPPED Dave Chinner
2024-03-19 17:30 ` Darrick J. Wong
2024-03-19 23:36 ` Dave Chinner
2024-03-18 22:45 ` [PATCH 5/9] xfs: buffer items don't straddle pages anymore Dave Chinner
2024-03-19 6:56 ` Christoph Hellwig
2024-03-19 17:31 ` Darrick J. Wong
2024-03-18 22:45 ` [PATCH 6/9] xfs: map buffers in xfs_buf_alloc_folios Dave Chinner
2024-03-19 17:34 ` Darrick J. Wong
2024-03-19 21:32 ` Christoph Hellwig
2024-03-19 21:39 ` Darrick J. Wong
2024-03-19 21:41 ` Christoph Hellwig
2024-03-18 22:45 ` [PATCH 7/9] xfs: walk b_addr for buffer I/O Dave Chinner
2024-03-19 17:42 ` Darrick J. Wong
2024-03-19 21:33 ` Christoph Hellwig
2024-03-18 22:45 ` [PATCH 8/9] xfs: use vmalloc for multi-folio buffers Dave Chinner
2024-03-19 17:48 ` Darrick J. Wong
2024-03-20 0:20 ` Dave Chinner [this message]
2024-03-18 22:46 ` [PATCH 9/9] xfs: rename bp->b_folio_count Dave Chinner
2024-03-19 7:37 ` Christoph Hellwig
2024-03-19 23:59 ` Dave Chinner
2024-03-19 0:24 ` [PATCH v2 0/9] xfs: use large folios for buffers Christoph Hellwig
2024-03-19 0:44 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZfortpehJBhja0Cf@dread.disaster.area \
--to=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.