public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>, Carlos Maiolino <cem@kernel.org>,
	Dave Chinner <dchinner@redhat.com>,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 10/12] xfs: use vmalloc instead of vm_map_area for buffer backing memory
Date: Thu, 6 Mar 2025 10:28:30 +1100	[thread overview]
Message-ID: <Z8jeHjpn_VTjMFCg@dread.disaster.area> (raw)
In-Reply-To: <20250305225407.GM2803749@frogsfrogsfrogs>

On Wed, Mar 05, 2025 at 02:54:07PM -0800, Darrick J. Wong wrote:
> On Thu, Mar 06, 2025 at 08:20:08AM +1100, Dave Chinner wrote:
> > On Wed, Mar 05, 2025 at 07:05:27AM -0700, Christoph Hellwig wrote:
> > > The fallback buffer allocation path currently open codes a suboptimal
> > > version of vmalloc to allocate pages that are then mapped into
> > > vmalloc space.  Switch to using vmalloc instead, which uses all the
> > > optimizations in the common vmalloc code, and removes the need to
> > > track the backing pages in the xfs_buf structure.
> > > 
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > .....
> > 
> > > @@ -1500,29 +1373,43 @@ static void
> > >  xfs_buf_submit_bio(
> > >  	struct xfs_buf		*bp)
> > >  {
> > > -	unsigned int		size = BBTOB(bp->b_length);
> > > -	unsigned int		map = 0, p;
> > > +	unsigned int		map = 0;
> > >  	struct blk_plug		plug;
> > >  	struct bio		*bio;
> > >  
> > > -	bio = bio_alloc(bp->b_target->bt_bdev, bp->b_page_count,
> > > -			xfs_buf_bio_op(bp), GFP_NOIO);
> > > -	bio->bi_private = bp;
> > > -	bio->bi_end_io = xfs_buf_bio_end_io;
> > > +	if (is_vmalloc_addr(bp->b_addr)) {
> > > +		unsigned int	size = BBTOB(bp->b_length);
> > > +		unsigned int	alloc_size = roundup(size, PAGE_SIZE);
> > > +		void		*data = bp->b_addr;
> > >  
> > > -	if (bp->b_page_count == 1) {
> > > -		__bio_add_page(bio, virt_to_page(bp->b_addr), size,
> > > -				offset_in_page(bp->b_addr));
> > > -	} else {
> > > -		for (p = 0; p < bp->b_page_count; p++)
> > > -			__bio_add_page(bio, bp->b_pages[p], PAGE_SIZE, 0);
> > > -		bio->bi_iter.bi_size = size; /* limit to the actual size used */
> > > +		bio = bio_alloc(bp->b_target->bt_bdev, alloc_size >> PAGE_SHIFT,
> > > +				xfs_buf_bio_op(bp), GFP_NOIO);
> > > +
> > > +		do {
> > > +			unsigned int	len = min(size, PAGE_SIZE);
> > >  
> > > -		if (is_vmalloc_addr(bp->b_addr))
> > > -			flush_kernel_vmap_range(bp->b_addr,
> > > -					xfs_buf_vmap_len(bp));
> > > +			ASSERT(offset_in_page(data) == 0);
> > > +			__bio_add_page(bio, vmalloc_to_page(data), len, 0);
> > > +			data += len;
> > > +			size -= len;
> > > +		} while (size);
> > > +
> > > +		flush_kernel_vmap_range(bp->b_addr, alloc_size);
> > > +	} else {
> > > +		/*
> > > +		 * Single folio or slab allocation.  Must be contiguous and thus
> > > +		 * only a single bvec is needed.
> > > +		 */
> > > +		bio = bio_alloc(bp->b_target->bt_bdev, 1, xfs_buf_bio_op(bp),
> > > +				GFP_NOIO);
> > > +		__bio_add_page(bio, virt_to_page(bp->b_addr),
> > > +				BBTOB(bp->b_length),
> > > +				offset_in_page(bp->b_addr));
> > >  	}
> > 
> > How does offset_in_page() work with a high order folio? It can only
> > return a value between 0 and (PAGE_SIZE - 1). i.e. shouldn't this
> > be:
> > 
> > 		folio = kmem_to_folio(bp->b_addr);
> > 
> > 		bio_add_folio_nofail(bio, folio, BBTOB(bp->b_length),
> > 				offset_in_folio(folio, bp->b_addr));
> 
> I think offset_in_folio() returns 0 in the !kmem && !vmalloc case
> because we allocate the folio and set b_addr to folio_address(folio);
> and we never call the kmem alloc code for sizes greater than PAGE_SIZE.

Yes, but that misses my point: this is a folio conversion, yet the
code still treats a folio as a page. We're trying to get rid of
exactly this sort of page/folio type confusion (i.e. questions like
"does offset_in_page() work correctly on large folios?"). New code
shouldn't be introducing new issues like these, especially when
there are existing folio-based APIs that are guaranteed to work
correctly and won't need fixing in future before pages and folios
can be fully separated.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

