From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>, Carlos Maiolino <cem@kernel.org>,
Dave Chinner <dchinner@redhat.com>,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 10/12] xfs: use vmalloc instead of vm_map_area for buffer backing memory
Date: Thu, 6 Mar 2025 10:28:30 +1100 [thread overview]
Message-ID: <Z8jeHjpn_VTjMFCg@dread.disaster.area> (raw)
In-Reply-To: <20250305225407.GM2803749@frogsfrogsfrogs>
On Wed, Mar 05, 2025 at 02:54:07PM -0800, Darrick J. Wong wrote:
> On Thu, Mar 06, 2025 at 08:20:08AM +1100, Dave Chinner wrote:
> > On Wed, Mar 05, 2025 at 07:05:27AM -0700, Christoph Hellwig wrote:
> > > The fallback buffer allocation path currently open codes a suboptimal
> > > version of vmalloc to allocate pages that are then mapped into
> > > vmalloc space. Switch to using vmalloc instead, which uses all the
> > > optimizations in the common vmalloc code, and removes the need to
> > > track the backing pages in the xfs_buf structure.
> > >
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > .....
> >
> > > @@ -1500,29 +1373,43 @@ static void
> > > xfs_buf_submit_bio(
> > > struct xfs_buf *bp)
> > > {
> > > - unsigned int size = BBTOB(bp->b_length);
> > > - unsigned int map = 0, p;
> > > + unsigned int map = 0;
> > > struct blk_plug plug;
> > > struct bio *bio;
> > >
> > > - bio = bio_alloc(bp->b_target->bt_bdev, bp->b_page_count,
> > > - xfs_buf_bio_op(bp), GFP_NOIO);
> > > - bio->bi_private = bp;
> > > - bio->bi_end_io = xfs_buf_bio_end_io;
> > > + if (is_vmalloc_addr(bp->b_addr)) {
> > > + unsigned int size = BBTOB(bp->b_length);
> > > + unsigned int alloc_size = roundup(size, PAGE_SIZE);
> > > + void *data = bp->b_addr;
> > >
> > > - if (bp->b_page_count == 1) {
> > > - __bio_add_page(bio, virt_to_page(bp->b_addr), size,
> > > - offset_in_page(bp->b_addr));
> > > - } else {
> > > - for (p = 0; p < bp->b_page_count; p++)
> > > - __bio_add_page(bio, bp->b_pages[p], PAGE_SIZE, 0);
> > > - bio->bi_iter.bi_size = size; /* limit to the actual size used */
> > > + bio = bio_alloc(bp->b_target->bt_bdev, alloc_size >> PAGE_SHIFT,
> > > + xfs_buf_bio_op(bp), GFP_NOIO);
> > > +
> > > + do {
> > > + unsigned int len = min(size, PAGE_SIZE);
> > >
> > > - if (is_vmalloc_addr(bp->b_addr))
> > > - flush_kernel_vmap_range(bp->b_addr,
> > > - xfs_buf_vmap_len(bp));
> > > + ASSERT(offset_in_page(data) == 0);
> > > + __bio_add_page(bio, vmalloc_to_page(data), len, 0);
> > > + data += len;
> > > + size -= len;
> > > + } while (size);
> > > +
> > > + flush_kernel_vmap_range(bp->b_addr, alloc_size);
> > > + } else {
> > > + /*
> > > + * Single folio or slab allocation. Must be contiguous and thus
> > > + * only a single bvec is needed.
> > > + */
> > > + bio = bio_alloc(bp->b_target->bt_bdev, 1, xfs_buf_bio_op(bp),
> > > + GFP_NOIO);
> > > + __bio_add_page(bio, virt_to_page(bp->b_addr),
> > > + BBTOB(bp->b_length),
> > > + offset_in_page(bp->b_addr));
> > > }
> >
> > How does offset_in_page() work with a high order folio? It can only
> > return a value between 0 and (PAGE_SIZE - 1). i.e. shouldn't this
> > be:
> >
> > folio = kmem_to_folio(bp->b_addr);
> >
> > bio_add_folio_nofail(bio, folio, BBTOB(bp->b_length),
> > offset_in_folio(folio, bp->b_addr));
>
> I think offset_in_folio() returns 0 in the !kmem && !vmalloc case
> because we allocate the folio and set b_addr to folio_address(folio);
> and we never call the kmem alloc code for sizes greater than PAGE_SIZE.
Yes, but that misses my point: this is a folio conversion, yet
this code treats a folio as a page. We're trying to get rid of this
sort of page/folio type confusion (i.e. questions like "does
offset_in_page() work correctly on large folios?"). New code
shouldn't be introducing new issues like these, especially when
there are existing folio-based APIs that are guaranteed to work
correctly and won't need fixing later, when pages and folios are
fully separated.
-Dave.
--
Dave Chinner
david@fromorbit.com