* [PATCH 1/2] btrfs: fix the ASSERT() inside GET_SUBPAGE_BITMAP()
2025-04-01 6:12 [PATCH 0/2] btrfs: two small and safe fixes for large folios Qu Wenruo
@ 2025-04-01 6:12 ` Qu Wenruo
2025-04-01 6:12 ` [PATCH 2/2] btrfs: fix the file offset calculation inside btrfs_decompress_buf2page() Qu Wenruo
1 sibling, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2025-04-01 6:12 UTC (permalink / raw)
To: linux-btrfs
After enabling large data folios for tests, I hit the ASSERT() inside
GET_SUBPAGE_BITMAP() where blocks_per_folio matches BITS_PER_LONG.
The ASSERT() itself is only based on the original subpage fs block size,
where we have at most 16 blocks per page, thus
"ASSERT(blocks_per_folio < BITS_PER_LONG)".
However the experimental large data folio support sets the max folio
order according to BITS_PER_LONG, so a large folio can contain exactly
BITS_PER_LONG blocks.
So the ASSERT() is too strict; change it to
"ASSERT(blocks_per_folio <= BITS_PER_LONG)" to avoid the false alert.
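To illustrate the boundary case, here is a standalone sketch (made-up
macro and function names, not kernel code; 4K block and page sizes
assumed) of how blocks_per_folio reaches exactly BITS_PER_LONG:

```c
#include <assert.h>
#include <limits.h>

/*
 * Standalone sketch, not kernel code: hypothetical 4K block and page
 * sizes, with the word width taken from the host's unsigned long.
 */
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BLOCKSIZE	4096UL
#define SKETCH_PAGE_SIZE	4096UL

/* Number of fs blocks inside a folio of 2^order pages. */
static unsigned long sketch_blocks_per_folio(unsigned int order)
{
	return (SKETCH_PAGE_SIZE << order) / SKETCH_BLOCKSIZE;
}
```

On a 64-bit system an order-6 folio holds exactly 64 blocks, so
blocks_per_folio == BITS_PER_LONG; the strict "<" check rejects a
perfectly valid layout, while "<=" accepts it and a full-word
bitmap_read() is still well defined.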
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/subpage.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
index 5fbdd977121e..d4f019233493 100644
--- a/fs/btrfs/subpage.c
+++ b/fs/btrfs/subpage.c
@@ -664,7 +664,7 @@ IMPLEMENT_BTRFS_PAGE_OPS(checked, folio_set_checked, folio_clear_checked,
btrfs_blocks_per_folio(fs_info, folio); \
const struct btrfs_subpage *subpage = folio_get_private(folio); \
\
- ASSERT(blocks_per_folio < BITS_PER_LONG); \
+ ASSERT(blocks_per_folio <= BITS_PER_LONG); \
*dst = bitmap_read(subpage->bitmaps, \
blocks_per_folio * btrfs_bitmap_nr_##name, \
blocks_per_folio); \
--
2.49.0
* [PATCH 2/2] btrfs: fix the file offset calculation inside btrfs_decompress_buf2page()
2025-04-01 6:12 [PATCH 0/2] btrfs: two small and safe fixes for large folios Qu Wenruo
2025-04-01 6:12 ` [PATCH 1/2] btrfs: fix the ASSERT() inside GET_SUBPAGE_BITMAP() Qu Wenruo
@ 2025-04-01 6:12 ` Qu Wenruo
2025-04-01 6:33 ` Sweet Tea Dorminy
1 sibling, 1 reply; 5+ messages in thread
From: Qu Wenruo @ 2025-04-01 6:12 UTC (permalink / raw)
To: linux-btrfs
[BUG WITH EXPERIMENTAL LARGE FOLIOS]
When testing the experimental large data folio support with compression,
there are several ASSERT()s triggered from btrfs_decompress_buf2page()
when running fsstress with compress=zstd mount option:
- ASSERT(copy_len) from btrfs_decompress_buf2page()
- VM_BUG_ON(offset + len > PAGE_SIZE) from memcpy_to_page()
[CAUSE]
Inside btrfs_decompress_buf2page(), we need to grab the file offset from
the current bvec.bv_page, to check if we even need to copy data into the
bio.
And since we are using single page bvecs, as long as no large folios are
involved, every page has its index properly set up.
But when large folios are involved, only the first page (aka, the head
page) of a large folio has its index properly initialized.
The other pages inside the large folio will not have their indexes
properly initialized.
Thus the page_offset() call inside btrfs_decompress_buf2page() returns
garbage and completely screws up the @copy_len calculation.
[FIX]
Instead of using page->index directly, go with page_pgoff(), which can
handle non-head pages correctly.
So introduce a helper, file_offset_from_bvec(), to get the file offset
from a single page bio_vec, so the copy_len calculation can be done
correctly.
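The fix can be modeled in a few lines of standalone C (hypothetical
struct and function names, not the kernel's; PAGE_SHIFT assumed to be 12
for 4K pages): only the folio's head index is trusted, and a page's file
offset is rebuilt from the folio index plus the page's position inside
the folio, shifted to bytes before bv_offset is added:

```c
#include <assert.h>

/* Standalone model with hypothetical names; 4K pages assumed. */
#define MODEL_PAGE_SHIFT 12

struct folio_model {
	unsigned long index;		/* file page index of the head page */
};

struct page_model {
	unsigned long idx_in_folio;	/* this page's position inside its folio */
	struct folio_model *folio;
};

/* page_pgoff() analogue: valid for head and tail pages alike. */
static unsigned long model_page_pgoff(const struct page_model *page)
{
	return page->folio->index + page->idx_in_folio;
}

/* file_offset_from_bvec() analogue: page offset in bytes plus bv_offset. */
static unsigned long long model_file_offset(const struct page_model *page,
					    unsigned int bv_offset)
{
	return ((unsigned long long)model_page_pgoff(page) << MODEL_PAGE_SHIFT) +
	       bv_offset;
}
```

For a folio at page index 16 (file offset 64K), the third page plus a
256 byte bv_offset resolves to (16 + 2) << 12 + 256 = 73984, something a
raw page->index read on that tail page cannot provide.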
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/compression.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index e7f8ee5d48a4..ee70f086c884 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -1137,6 +1137,22 @@ void __cold btrfs_exit_compress(void)
bioset_exit(&btrfs_compressed_bioset);
}
+/*
+ * The bvec is a single page bvec from a bio that contains folios from a filemap.
+ *
+ * Since the folios may be large ones, and if the bv_page is not the head
+ * page of a large folio, then page->index is unreliable.
+ *
+ * Thus we need this helper to grab the proper file offset.
+ */
+static u64 file_offset_from_bvec(const struct bio_vec *bvec)
+{
+ const struct page *page = bvec->bv_page;
+ const struct folio *folio = page_folio(page);
+
+ return (page_pgoff(folio, page) << PAGE_SHIFT) + bvec->bv_offset;
+}
+
/*
* Copy decompressed data from working buffer to pages.
*
@@ -1188,7 +1204,7 @@ int btrfs_decompress_buf2page(const char *buf, u32 buf_len,
* cb->start may underflow, but subtracting that value can still
* give us correct offset inside the full decompressed extent.
*/
- bvec_offset = page_offset(bvec.bv_page) + bvec.bv_offset - cb->start;
+ bvec_offset = file_offset_from_bvec(&bvec) - cb->start;
/* Haven't reached the bvec range, exit */
if (decompressed + buf_len <= bvec_offset)
--
2.49.0