* [PATCH v3 1/9] btrfs: send: remove the again label inside put_file_data()
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 2/9] btrfs: send: prepare put_file_data() for larger data folios Qu Wenruo
` (8 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
The again label is here to retry grabbing the folio for the current index.
When the label is triggered, the iterator is not advanced, so it can be
replaced by a simple "continue" and the again label removed.
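The control-flow change can be sketched in plain userspace C (a toy model; the revalidate() helper below is a hypothetical stand-in for the "folio->mapping != mapping" recheck going stale a few times):

```c
#include <assert.h>
#include <stdbool.h>

static int flaky_checks;	/* number of times revalidation fails */

/* Hypothetical stand-in for the folio lookup + mapping recheck. */
static bool revalidate(void)
{
	if (flaky_checks > 0) {
		flaky_checks--;
		return false;
	}
	return true;
}

/* Same loop shape as put_file_data() after the patch. */
static int process_range(int first, int last)
{
	int processed = 0;
	int index = first;

	while (index <= last) {
		if (!revalidate())
			continue;	/* was: goto again; no index++ here */
		processed++;
		index++;
	}
	return processed;
}
```

Because "continue" re-enters the loop before the increment, it retries the same index exactly as the old "goto again" did.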
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/send.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 0c8c58c4f29b..43c29295f477 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -5280,7 +5280,6 @@ static int put_file_data(struct send_ctx *sctx, u64 offset, u32 len)
unsigned cur_len = min_t(unsigned, len,
PAGE_SIZE - pg_offset);
-again:
folio = filemap_lock_folio(mapping, index);
if (IS_ERR(folio)) {
page_cache_sync_readahead(mapping,
@@ -5316,7 +5315,7 @@ static int put_file_data(struct send_ctx *sctx, u64 offset, u32 len)
if (folio->mapping != mapping) {
folio_unlock(folio);
folio_put(folio);
- goto again;
+ continue;
}
}
--
2.48.1
* [PATCH v3 2/9] btrfs: send: prepare put_file_data() for larger data folios
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 1/9] btrfs: send: remove the again label inside put_file_data() Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 3/9] btrfs: prepare btrfs_page_mkwrite() " Qu Wenruo
` (7 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
Currently the function put_file_data() can only accept page sized folios.
However the function itself is not that complex, it just copies data
from the filemap folios into the send buffer.
So make it support larger data folios by:
- Changing the loop to use the file offset instead of the page index
- Calculating @pg_offset and @cur_len after getting the folio
- Removing the "WARN_ON(folio_order(folio));" line
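The new loop shape can be sketched in userspace C. The uniform 16K folio size and the folio_start_at() helper below are hypothetical stand-ins for filemap_lock_folio()/folio_pos(); real folios can have mixed sizes:

```c
#include <assert.h>

typedef unsigned long long u64;

#define PAGE_SIZE	4096ULL
#define FOLIO_SIZE	(4 * PAGE_SIZE)	/* pretend every folio is 16K */

/* Hypothetical stand-in for folio_pos() of the folio covering @pos. */
static u64 folio_start_at(u64 pos)
{
	return pos - pos % FOLIO_SIZE;
}

static u64 min_u64(u64 a, u64 b)
{
	return a < b ? a : b;
}

/* Mirror of the new put_file_data() loop: iterate by file offset. */
static int copy_range(u64 offset, u64 len)
{
	u64 cur = offset;
	const u64 end = offset + len;
	int iterations = 0;

	while (cur < end) {
		u64 pg_offset = cur - folio_start_at(cur); /* offset_in_folio() */
		u64 cur_len = min_u64(end - cur, FOLIO_SIZE - pg_offset);

		/* ...copy cur_len bytes into the send buffer here... */
		cur += cur_len;
		iterations++;
	}
	return iterations;
}
```

With offset 6K and length 20K, the loop finishes in two iterations (10K from the first folio, 10K from the second) instead of six page-sized ones.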
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/send.c | 26 ++++++++++++--------------
1 file changed, 12 insertions(+), 14 deletions(-)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 43c29295f477..4df07dfe326f 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -5263,10 +5263,9 @@ static int put_file_data(struct send_ctx *sctx, u64 offset, u32 len)
{
struct btrfs_root *root = sctx->send_root;
struct btrfs_fs_info *fs_info = root->fs_info;
- struct folio *folio;
- pgoff_t index = offset >> PAGE_SHIFT;
- pgoff_t last_index;
- unsigned pg_offset = offset_in_page(offset);
+ u64 cur = offset;
+ const u64 end = offset + len;
+ const pgoff_t last_index = (end - 1) >> PAGE_SHIFT;
struct address_space *mapping = sctx->cur_inode->i_mapping;
int ret;
@@ -5274,11 +5273,11 @@ static int put_file_data(struct send_ctx *sctx, u64 offset, u32 len)
if (ret)
return ret;
- last_index = (offset + len - 1) >> PAGE_SHIFT;
-
- while (index <= last_index) {
- unsigned cur_len = min_t(unsigned, len,
- PAGE_SIZE - pg_offset);
+ while (cur < end) {
+ pgoff_t index = cur >> PAGE_SHIFT;
+ unsigned int cur_len;
+ unsigned int pg_offset;
+ struct folio *folio;
folio = filemap_lock_folio(mapping, index);
if (IS_ERR(folio)) {
@@ -5292,8 +5291,9 @@ static int put_file_data(struct send_ctx *sctx, u64 offset, u32 len)
break;
}
}
-
- WARN_ON(folio_order(folio));
+ pg_offset = offset_in_folio(folio, cur);
+ cur_len = min_t(unsigned int, end - cur,
+ folio_size(folio) - pg_offset);
if (folio_test_readahead(folio))
page_cache_async_readahead(mapping, &sctx->ra, NULL, folio,
@@ -5323,9 +5323,7 @@ static int put_file_data(struct send_ctx *sctx, u64 offset, u32 len)
pg_offset, cur_len);
folio_unlock(folio);
folio_put(folio);
- index++;
- pg_offset = 0;
- len -= cur_len;
+ cur += cur_len;
sctx->send_size += cur_len;
}
--
2.48.1
* [PATCH v3 3/9] btrfs: prepare btrfs_page_mkwrite() for larger data folios
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 1/9] btrfs: send: remove the again label inside put_file_data() Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 2/9] btrfs: send: prepare put_file_data() for larger data folios Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 4/9] btrfs: prepare prepare_one_folio() " Qu Wenruo
` (6 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
The function btrfs_page_mkwrite() has an explicit ASSERT() checking the
folio order.
To make it support larger data folios, we need to:
- Remove the ASSERT(folio_order(folio) == 0)
- Use folio_contains() to check if the folio covers the last page
Otherwise the code already supports larger folios well.
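Why the old equality check breaks for larger folios can be shown with a minimal userspace model (the struct and helper below are simplified stand-ins, not the kernel definitions):

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long pgoff_t;

/* Simplified model of a folio: first page index plus page count. */
struct folio {
	pgoff_t index;
	unsigned long nr_pages;
};

/* Same semantics as the kernel's folio_contains(). */
static bool folio_contains(const struct folio *folio, pgoff_t idx)
{
	return idx >= folio->index && idx < folio->index + folio->nr_pages;
}
```

For a 16K folio at index 0 and an inode size of 14000 bytes, the last page index is (14000 - 1) >> PAGE_SHIFT = 3; the old "folio->index == 3" check is false even though the folio does cover that page, while folio_contains() catches it.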
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/file.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 262a707d8990..4213807982d6 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1791,8 +1791,6 @@ static vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
u64 page_end;
u64 end;
- ASSERT(folio_order(folio) == 0);
-
reserved_space = fsize;
sb_start_pagefault(inode->i_sb);
@@ -1857,7 +1855,7 @@ static vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
goto again;
}
- if (folio->index == ((size - 1) >> PAGE_SHIFT)) {
+ if (folio_contains(folio, (size - 1) >> PAGE_SHIFT)) {
reserved_space = round_up(size - page_start, fs_info->sectorsize);
if (reserved_space < fsize) {
end = page_start + reserved_space - 1;
--
2.48.1
* [PATCH v3 4/9] btrfs: prepare prepare_one_folio() for larger data folios
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
` (2 preceding siblings ...)
2025-03-17 7:10 ` [PATCH v3 3/9] btrfs: prepare btrfs_page_mkwrite() " Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 5/9] btrfs: prepare end_bbio_data_write() " Qu Wenruo
` (5 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
The only blocker is the ASSERT() rejecting larger folios; just remove
it.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/file.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 4213807982d6..c2648858772a 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -875,8 +875,6 @@ static noinline int prepare_one_folio(struct inode *inode, struct folio **folio_
ret = PTR_ERR(folio);
return ret;
}
- /* Only support page sized folio yet. */
- ASSERT(folio_order(folio) == 0);
ret = set_folio_extent_mapped(folio);
if (ret < 0) {
folio_unlock(folio);
--
2.48.1
* [PATCH v3 5/9] btrfs: prepare end_bbio_data_write() for larger data folios
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
` (3 preceding siblings ...)
2025-03-17 7:10 ` [PATCH v3 4/9] btrfs: prepare prepare_one_folio() " Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 6/9] btrfs: subpage: prepare " Qu Wenruo
` (4 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
The function does an ASSERT() checking the folio order, but all the
functions called later already handle larger folios properly, thus we
can safely remove that ASSERT().
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/extent_io.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 7db74a173b77..d5d4f9b06309 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -462,9 +462,6 @@ static void end_bbio_data_write(struct btrfs_bio *bbio)
u64 start = folio_pos(folio) + fi.offset;
u32 len = fi.length;
- /* Only order 0 (single page) folios are allowed for data. */
- ASSERT(folio_order(folio) == 0);
-
/* Our read/write should always be sector aligned. */
if (!IS_ALIGNED(fi.offset, sectorsize))
btrfs_err(fs_info,
--
2.48.1
* [PATCH v3 6/9] btrfs: subpage: prepare for larger data folios
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
` (4 preceding siblings ...)
2025-03-17 7:10 ` [PATCH v3 5/9] btrfs: prepare end_bbio_data_write() " Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 7/9] btrfs: zlib: prepare copy_data_into_buffer() " Qu Wenruo
` (3 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
The subpage handling code has two locations not supporting larger
folios:
- btrfs_attach_subpage()
It does a metadata specific ASSERT() check, but for the future larger
data folio support that check is too generic.
Since it's metadata specific, only do the ASSERT() check for metadata.
- btrfs_subpage_assert()
Just remove the "ASSERT(folio_order(folio) == 0)" check.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/subpage.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
index 834161f35a00..bf7fd50ab9ec 100644
--- a/fs/btrfs/subpage.c
+++ b/fs/btrfs/subpage.c
@@ -69,7 +69,8 @@ int btrfs_attach_subpage(const struct btrfs_fs_info *fs_info,
struct btrfs_subpage *subpage;
/* For metadata we don't support larger folio yet. */
- ASSERT(!folio_test_large(folio));
+ if (type == BTRFS_SUBPAGE_METADATA)
+ ASSERT(!folio_test_large(folio));
/*
* We have cases like a dummy extent buffer page, which is not mapped
@@ -181,9 +182,6 @@ void btrfs_folio_dec_eb_refs(const struct btrfs_fs_info *fs_info, struct folio *
static void btrfs_subpage_assert(const struct btrfs_fs_info *fs_info,
struct folio *folio, u64 start, u32 len)
{
- /* For subpage support, the folio must be single page. */
- ASSERT(folio_order(folio) == 0);
-
/* Basic checks */
ASSERT(folio_test_private(folio) && folio_get_private(folio));
ASSERT(IS_ALIGNED(start, fs_info->sectorsize) &&
--
2.48.1
* [PATCH v3 7/9] btrfs: zlib: prepare copy_data_into_buffer() for larger data folios
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
` (5 preceding siblings ...)
2025-03-17 7:10 ` [PATCH v3 6/9] btrfs: subpage: prepare " Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-03-17 7:10 ` [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() " Qu Wenruo
` (2 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
The function itself already takes larger folios into consideration;
just remove the ASSERT(!folio_test_large()) line.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/zlib.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
index 3a7d136f57b4..b32aa05b288e 100644
--- a/fs/btrfs/zlib.c
+++ b/fs/btrfs/zlib.c
@@ -120,8 +120,6 @@ static int copy_data_into_buffer(struct address_space *mapping,
ret = btrfs_compress_filemap_get_folio(mapping, cur, &folio);
if (ret < 0)
return ret;
- /* No larger folio support yet. */
- ASSERT(!folio_test_large(folio));
offset = offset_in_folio(folio, cur);
copy_length = min(folio_size(folio) - offset,
--
2.48.1
* [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() for larger data folios
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
` (6 preceding siblings ...)
2025-03-17 7:10 ` [PATCH v3 7/9] btrfs: zlib: prepare copy_data_into_buffer() " Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-04-03 19:56 ` David Sterba
` (2 more replies)
2025-03-17 7:10 ` [PATCH v3 9/9] btrfs: enable larger data folios support for defrag Qu Wenruo
2025-03-17 13:55 ` [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() David Sterba
9 siblings, 3 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
The function btrfs_end_repair_bio() has an ASSERT() making sure the
folio is page sized.
The reason is mostly related to the fact that later we pass a folio and
its offset into btrfs_repair_io_failure().
If larger folios are passed in, the later calculation of the folio and
its offset can go wrong, as there can be an extra offset on top of
bv_page.
Change the behavior by:
- Doing a proper folio grab
Instead of just page_folio(bv_page), we should get the real page (as
bv_offset can be larger than the page size), then call page_folio().
- Doing an extra folio offset calculation
We can have the following cases of a bio_vec (this bv is moved
forward by btrfs read verification):
bv_page bv_offset
| |
| | | | | | | | | | | | | | | | |
|<- folio_a ->|<- folio_b ->|
| | = a page.
In the above case, the real folio should be folio_b, and the offset
inside that folio should be:
bv_offset - ((&folio_b->page - &folio_a->page) << PAGE_SHIFT).
With these changes, now btrfs_end_repair_bio() is able to handle larger
folios properly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/bio.c | 28 +++++++++++++++++++++-------
1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 8c2eee1f1878..292c79e0855f 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -156,6 +156,25 @@ static void btrfs_repair_done(struct btrfs_failed_bio *fbio)
}
}
+static struct folio *bio_vec_get_folio(const struct bio_vec *bv)
+{
+ return page_folio(bv->bv_page + (bv->bv_offset >> PAGE_SHIFT));
+}
+
+static unsigned long bio_vec_get_folio_offset(const struct bio_vec *bv)
+{
+ struct folio *folio = bio_vec_get_folio(bv);
+
+ /*
+ * There can be multiple physically contiguous folios queued
+ * into the bio_vec.
+ * Thus the first page of our folio should be at or beyond
+ * the first page of the bio_vec.
+ */
+ ASSERT(&folio->page >= bv->bv_page);
+ return bv->bv_offset - ((&folio->page - bv->bv_page) << PAGE_SHIFT);
+}
+
static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
struct btrfs_device *dev)
{
@@ -165,12 +184,6 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
struct bio_vec *bv = bio_first_bvec_all(&repair_bbio->bio);
int mirror = repair_bbio->mirror_num;
- /*
- * We can only trigger this for data bio, which doesn't support larger
- * folios yet.
- */
- ASSERT(folio_order(page_folio(bv->bv_page)) == 0);
-
if (repair_bbio->bio.bi_status ||
!btrfs_data_csum_ok(repair_bbio, dev, 0, bv)) {
bio_reset(&repair_bbio->bio, NULL, REQ_OP_READ);
@@ -192,7 +205,8 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
btrfs_repair_io_failure(fs_info, btrfs_ino(inode),
repair_bbio->file_offset, fs_info->sectorsize,
repair_bbio->saved_iter.bi_sector << SECTOR_SHIFT,
- page_folio(bv->bv_page), bv->bv_offset, mirror);
+ bio_vec_get_folio(bv), bio_vec_get_folio_offset(bv),
+ mirror);
} while (mirror != fbio->bbio->mirror_num);
done:
--
2.48.1
* Re: [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() for larger data folios
2025-03-17 7:10 ` [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() " Qu Wenruo
@ 2025-04-03 19:56 ` David Sterba
2025-04-06 23:19 ` Qu Wenruo
2025-04-07 6:18 ` Christoph Hellwig
2 siblings, 0 replies; 19+ messages in thread
From: David Sterba @ 2025-04-03 19:56 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Mon, Mar 17, 2025 at 05:40:53PM +1030, Qu Wenruo wrote:
> The function btrfs_end_repair_bio() has an ASSERT() making sure the
> folio is page sized.
>
> The reason is mostly related to the fact that later we pass a folio and
> its offset into btrfs_repair_io_failure().
> If we have larger folios passed in, later calculation of the folio and
> its offset can go wrong, as we have extra offset to the bv_page.
>
> Change the behavior by:
>
> - Doing a proper folio grab
> Instead of just page_folio(bv_page), we should get the real page (as the
> bv_offset can be larger than page size), then call page_folio().
>
> - Do extra folio offset calculation
> We can have the following cases of a bio_vec (this bv is moved
> forward by btrfs read verification):
>
> bv_page bv_offset
> | |
> | | | | | | | | | | | | | | | | |
> |<- folio_a ->|<- folio_b ->|
>
> | | = a page.
>
> In above case, the real folio should be folio_b, and offset inside that
> folio should be:
>
> bv_offset - ((&folio_b->page - &folio_a->page) << PAGE_SHIFT).
>
> With these changes, now btrfs_end_repair_bio() is able to handle larger
> folios properly.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/bio.c | 28 +++++++++++++++++++++-------
> 1 file changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
> index 8c2eee1f1878..292c79e0855f 100644
> --- a/fs/btrfs/bio.c
> +++ b/fs/btrfs/bio.c
> @@ -156,6 +156,25 @@ static void btrfs_repair_done(struct btrfs_failed_bio *fbio)
> }
> }
>
> +static struct folio *bio_vec_get_folio(const struct bio_vec *bv)
> +{
> + return page_folio(bv->bv_page + (bv->bv_offset >> PAGE_SHIFT));
> +}
> +
> +static unsigned long bio_vec_get_folio_offset(const struct bio_vec *bv)
> +{
> + struct folio *folio = bio_vec_get_folio(bv);
> +
> + /*
> + * There can be multiple physically contiguous folios queued
> + * into the bio_vec.
> + * Thus the first page of our folio should be at or beyond
> + * the first page of the bio_vec.
> + */
> + ASSERT(&folio->page >= bv->bv_page);
> + return bv->bv_offset - ((&folio->page - bv->bv_page) << PAGE_SHIFT);
This looks like it should be a generic helper, the expression looks
quite magic.
* Re: [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() for larger data folios
2025-03-17 7:10 ` [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() " Qu Wenruo
2025-04-03 19:56 ` David Sterba
@ 2025-04-06 23:19 ` Qu Wenruo
2025-04-07 6:18 ` Christoph Hellwig
2 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-04-06 23:19 UTC (permalink / raw)
To: linux-btrfs
On 2025/3/17 17:40, Qu Wenruo wrote:
> The function btrfs_end_repair_bio() has an ASSERT() making sure the
> folio is page sized.
>
> The reason is mostly related to the fact that later we pass a folio and
> its offset into btrfs_repair_io_failure().
> If we have larger folios passed in, later calculation of the folio and
> its offset can go wrong, as we have extra offset to the bv_page.
>
> Change the behavior by:
>
> - Doing a proper folio grab
> Instead of just page_folio(bv_page), we should get the real page (as the
> bv_offset can be larger than page size), then call page_folio().
>
> - Do extra folio offset calculation
> We can have the following cases of a bio_vec (this bv is moved
> forward by btrfs read verification):
>
> bv_page bv_offset
> | |
> | | | | | | | | | | | | | | | | |
> |<- folio_a ->|<- folio_b ->|
>
> | | = a page.
>
> In above case, the real folio should be folio_b, and offset inside that
> folio should be:
>
> bv_offset - ((&folio_b->page - &folio_a->page) << PAGE_SHIFT).
This part is a little confusing and resulted in an incorrect
calculation and ASSERT().
A more accurate example (covering more corner cases) for calculating
the offset would be something like this:
real_page
bv_page | bv_offset (10K)
| | |
v v v
| | | |
|<- F1 ->|<--- Folio 2 -->|
| result off |
In that case, we only need to care about one page pointer, the
@real_page, which is (bv_page + (bv_offset >> PAGE_SHIFT)).
Meanwhile the offset inside folio 2 consists of two parts:
- The offset between folio2's head page and real_page
We have an existing helper, folio_page_idx(), to do the calculation.
So this part is just (folio_page_idx() << PAGE_SHIFT).
- The offset inside that @real_page
Since it's now page based, we do not need to bother with the folio
size; offset_in_page() will do the work.
So the resulted value should be:
(folio_page_idx() << PAGE_SHIFT) + offset_in_page(bv_offset)
Furthermore, the above case shows that we cannot use offset_in_folio()
either, as the previous folio is not of the same order.
(offset_in_folio() would give 2K, not the correct 6K.)
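The corrected math can be checked with a small userspace model. Pages are modeled as plain indices; matching the diagram, F1 is the single page 0 and folio 2 is a 16K folio covering pages 1-4 (all of this is a toy model, not kernel code):

```c
#include <assert.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

/* From the diagram: F1 is the single page 0, folio 2 covers pages 1-4. */
#define FOLIO2_HEAD_PAGE	1UL

static unsigned long folio_offset(unsigned long bv_page_idx,
				  unsigned long bv_offset)
{
	/* The page the offset actually lands in. */
	unsigned long real_page = bv_page_idx + (bv_offset >> PAGE_SHIFT);
	/* folio_page_idx(): pages between the folio head and real_page. */
	unsigned long page_idx = real_page - FOLIO2_HEAD_PAGE;

	return (page_idx << PAGE_SHIFT) + (bv_offset & (PAGE_SIZE - 1));
}
```

With bv_page pointing at F1 (index 0) and bv_offset = 10K, real_page is page 2, folio_page_idx() is 1, and the result is 4K + 2K = 6K, matching the diagram.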
I'll send out an updated series with this fixed, and enable large folios
for testing.
Thanks,
Qu
>
> With these changes, now btrfs_end_repair_bio() is able to handle larger
> folios properly.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/bio.c | 28 +++++++++++++++++++++-------
> 1 file changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
> index 8c2eee1f1878..292c79e0855f 100644
> --- a/fs/btrfs/bio.c
> +++ b/fs/btrfs/bio.c
> @@ -156,6 +156,25 @@ static void btrfs_repair_done(struct btrfs_failed_bio *fbio)
> }
> }
>
> +static struct folio *bio_vec_get_folio(const struct bio_vec *bv)
> +{
> + return page_folio(bv->bv_page + (bv->bv_offset >> PAGE_SHIFT));
> +}
> +
> +static unsigned long bio_vec_get_folio_offset(const struct bio_vec *bv)
> +{
> + struct folio *folio = bio_vec_get_folio(bv);
> +
> + /*
> + * There can be multiple physically contiguous folios queued
> + * into the bio_vec.
> + * Thus the first page of our folio should be at or beyond
> + * the first page of the bio_vec.
> + */
> + ASSERT(&folio->page >= bv->bv_page);
> + return bv->bv_offset - ((&folio->page - bv->bv_page) << PAGE_SHIFT);
> +}
> +
> static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
> struct btrfs_device *dev)
> {
> @@ -165,12 +184,6 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
> struct bio_vec *bv = bio_first_bvec_all(&repair_bbio->bio);
> int mirror = repair_bbio->mirror_num;
>
> - /*
> - * We can only trigger this for data bio, which doesn't support larger
> - * folios yet.
> - */
> - ASSERT(folio_order(page_folio(bv->bv_page)) == 0);
> -
> if (repair_bbio->bio.bi_status ||
> !btrfs_data_csum_ok(repair_bbio, dev, 0, bv)) {
> bio_reset(&repair_bbio->bio, NULL, REQ_OP_READ);
> @@ -192,7 +205,8 @@ static void btrfs_end_repair_bio(struct btrfs_bio *repair_bbio,
> btrfs_repair_io_failure(fs_info, btrfs_ino(inode),
> repair_bbio->file_offset, fs_info->sectorsize,
> repair_bbio->saved_iter.bi_sector << SECTOR_SHIFT,
> - page_folio(bv->bv_page), bv->bv_offset, mirror);
> + bio_vec_get_folio(bv), bio_vec_get_folio_offset(bv),
> + mirror);
> } while (mirror != fbio->bbio->mirror_num);
>
> done:
* Re: [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() for larger data folios
2025-03-17 7:10 ` [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() " Qu Wenruo
2025-04-03 19:56 ` David Sterba
2025-04-06 23:19 ` Qu Wenruo
@ 2025-04-07 6:18 ` Christoph Hellwig
2025-04-07 7:11 ` Qu Wenruo
2 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2025-04-07 6:18 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Mon, Mar 17, 2025 at 05:40:53PM +1030, Qu Wenruo wrote:
> In above case, the real folio should be folio_b, and offset inside that
> folio should be:
>
> bv_offset - ((&folio_b->page - &folio_a->page) << PAGE_SHIFT).
>
> With these changes, now btrfs_end_repair_bio() is able to handle larger
> folios properly.
Please stop messing with internals like bv_offset and bv_page entirely
if you can, as that makes the pending conversion of the bio_vec to store
a physical address much harder.
Looking at the code the best way to handle this would be to simply
split btrfs_repair_io_failure so that there is a helper for the code
before the bio_init call. btrfs_repair_eb_io_failure (or a new helper
called by it) then open codes the existing logic using bio_add_folio
(it could in fact use bio_add_folio_nofail instead), while
btrfs_end_repair_bio should just copy the existing bio_vec over
using an assignment or memcpy.
* Re: [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() for larger data folios
2025-04-07 6:18 ` Christoph Hellwig
@ 2025-04-07 7:11 ` Qu Wenruo
0 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-04-07 7:11 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-btrfs
On 2025/4/7 15:48, Christoph Hellwig wrote:
> On Mon, Mar 17, 2025 at 05:40:53PM +1030, Qu Wenruo wrote:
>> In above case, the real folio should be folio_b, and offset inside that
>> folio should be:
>>
>> bv_offset - ((&folio_b->page - &folio_a->page) << PAGE_SHIFT).
>>
>> With these changes, now btrfs_end_repair_bio() is able to handle larger
>> folios properly.
>
> Please stop messing with internals like bv_offset and bv_page entirely
> if you can, as that makes the pending conversion of the bio_vec to store
> a physical address much harder.
Well, I know this version is bad and it is already causing bugs when I
enabled large folios.
I have a better solution in the latest version to calculate the offset
inside the folio, but I'm not sure if it still counts as messing with
internals:
struct page *real_page = bv->bv_page +
(bv->bv_offset >> PAGE_SHIFT);
struct folio *folio = page_folio(real_page);
return (folio_page_idx(folio, real_page) << PAGE_SHIFT) +
offset_in_page(bv->bv_offset);
The latest version can be found here just for reference, with the corner
case taken into consideration.
https://lore.kernel.org/linux-btrfs/f679cbeedbb0132cc657b4db7a1d3d70ff2261f0.1744005845.git.wqu@suse.com/T/#u
>
> Looking at the code the best way to handle this would be to simply
> split btrfs_repair_io_failure so that there is a helper for the code
> before the bio_init call. btrfs_repair_eb_io_failure (or a new helper
> called by it) then open codes the existing logic using bio_add_folio
> (it could in fact use bio_add_folio_nofail instead), while
> btrfs_end_repair_bio should just copy the existing bio_vec over
> using an assignment or mempcy.
I guess "btrfs_repair_eb_io_failure" here is just a typo, as we're only
talking about data read-repair.
Mind giving some pseudo code example? I still haven't caught all the
points.
There are quite some direct bv_page/bv_len/bv_offset usage inside
fs/btrfs/bio.c, and to my uneducated eyes it looks like we will need a
way to convert bio_vec to folio+offset anyway, as long as we're still
using bio_vec as btrfs_bio::saved_iter.
Thanks,
Qu
* [PATCH v3 9/9] btrfs: enable larger data folios support for defrag
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
` (7 preceding siblings ...)
2025-03-17 7:10 ` [PATCH v3 8/9] btrfs: prepare btrfs_end_repair_bio() " Qu Wenruo
@ 2025-03-17 7:10 ` Qu Wenruo
2025-03-17 13:55 ` [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() David Sterba
9 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-03-17 7:10 UTC (permalink / raw)
To: linux-btrfs
Currently we reject larger folios for defrag gracefully, but the
implementation itself is already mostly compatible with larger folios.
There are several parts of defrag in btrfs:
- Extent map checking
Aka, defrag_collect_targets(), which prepares a list of target ranges
that should be defragged.
This part is completely folio unrelated, thus it doesn't care whether
the folio is large or not.
- Target folio preparation
Aka, defrag_prepare_one_folio(), which locks and reads (if needed) the
target folio.
Since folio read and lock already support larger folios, this part is
easy to support.
- Redirty the target range of the folio
This is already done in a way supporting larger folios.
So it's pretty straightforward to enable larger folios for defrag:
- Do not reject larger folios for experimental builds
This affects the larger folio check inside defrag_prepare_one_folio().
- Wait for ordered extents of the whole folio in
defrag_prepare_one_folio()
- Lock the whole extent range for all involved folios in
defrag_one_range()
- Allow the folios[] array to be partially empty
Since we can have larger folios, folios[] will not always be full.
This affects:
* How to allocate folios in defrag_one_range()
Now we cannot use the page index, but use the end position of the
folio as the iterator.
* How to free the folios[] array
If we hit an empty slot, it means we have larger folios and already
hit the end of the array.
* How to mark the range dirty
Instead of using the page index directly, we have to go through each
folio.
Unfortunately the behavior can only be verified with larger data folio
support enabled; for now the change only ensures that it doesn't break
the existing defrag behavior for regular and subpage cases.
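The partially-empty folios[] iteration can be sketched in userspace C. The uniform 16K folio size below is a hypothetical stand-in for whatever defrag_prepare_one_folio() actually returns:

```c
#include <assert.h>

typedef unsigned long long u64;

#define PAGE_SIZE	4096ULL
#define FOLIO_SIZE	(4 * PAGE_SIZE)	/* pretend every folio is 16K */

/* Hypothetical folio_pos() + folio_size() of the folio covering @pos. */
static u64 folio_end_at(u64 pos)
{
	return pos - pos % FOLIO_SIZE + FOLIO_SIZE;
}

/*
 * Mirror of the new defrag_one_range() preparation loop: iterate by the
 * end position of each prepared folio; with larger folios, fewer than
 * nr_pages slots of folios[] end up used and the rest stay empty.
 */
static int prepare_folios(u64 start, u64 len, int nr_pages)
{
	u64 cur = start;
	int used = 0;

	for (int i = 0; cur < start + len && i < nr_pages; i++) {
		cur = folio_end_at(cur);
		used++;
	}
	return used;
}
```

A 64K range that would need 16 page-sized folios is covered by 4 entries; the trailing slots stay NULL, so both the free path and the dirtying path must stop at the first empty slot.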
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/defrag.c | 72 +++++++++++++++++++++++++++--------------------
1 file changed, 42 insertions(+), 30 deletions(-)
diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c
index d4310d93f532..f2fa8b5c64b5 100644
--- a/fs/btrfs/defrag.c
+++ b/fs/btrfs/defrag.c
@@ -857,13 +857,14 @@ static struct folio *defrag_prepare_one_folio(struct btrfs_inode *inode, pgoff_t
{
struct address_space *mapping = inode->vfs_inode.i_mapping;
gfp_t mask = btrfs_alloc_write_mask(mapping);
- u64 page_start = (u64)index << PAGE_SHIFT;
- u64 page_end = page_start + PAGE_SIZE - 1;
+ u64 folio_start;
+ u64 folio_end;
struct extent_state *cached_state = NULL;
struct folio *folio;
int ret;
again:
+ /* TODO: Add order fgp order flags when larger folios are fully enabled. */
folio = __filemap_get_folio(mapping, index,
FGP_LOCK | FGP_ACCESSED | FGP_CREAT, mask);
if (IS_ERR(folio))
@@ -871,13 +872,16 @@ static struct folio *defrag_prepare_one_folio(struct btrfs_inode *inode, pgoff_t
/*
* Since we can defragment files opened read-only, we can encounter
- * transparent huge pages here (see CONFIG_READ_ONLY_THP_FOR_FS). We
- * can't do I/O using huge pages yet, so return an error for now.
+ * transparent huge pages here (see CONFIG_READ_ONLY_THP_FOR_FS).
+ *
+ * The IO for such larger folios are not fully tested, thus return
+ * an error to reject such folios unless it's an experimental build.
+ *
* Filesystem transparent huge pages are typically only used for
* executables that explicitly enable them, so this isn't very
* restrictive.
*/
- if (folio_test_large(folio)) {
+ if (!IS_ENABLED(CONFIG_BTRFS_EXPERIMENTAL) && folio_test_large(folio)) {
folio_unlock(folio);
folio_put(folio);
return ERR_PTR(-ETXTBSY);
@@ -890,13 +894,15 @@ static struct folio *defrag_prepare_one_folio(struct btrfs_inode *inode, pgoff_t
return ERR_PTR(ret);
}
+ folio_start = folio_pos(folio);
+ folio_end = folio_pos(folio) + folio_size(folio) - 1;
/* Wait for any existing ordered extent in the range */
while (1) {
struct btrfs_ordered_extent *ordered;
- lock_extent(&inode->io_tree, page_start, page_end, &cached_state);
- ordered = btrfs_lookup_ordered_range(inode, page_start, PAGE_SIZE);
- unlock_extent(&inode->io_tree, page_start, page_end,
+ lock_extent(&inode->io_tree, folio_start, folio_end, &cached_state);
+ ordered = btrfs_lookup_ordered_range(inode, folio_start, folio_size(folio));
+ unlock_extent(&inode->io_tree, folio_start, folio_end,
&cached_state);
if (!ordered)
break;
@@ -1162,13 +1168,7 @@ static int defrag_one_locked_target(struct btrfs_inode *inode,
struct extent_changeset *data_reserved = NULL;
const u64 start = target->start;
const u64 len = target->len;
- unsigned long last_index = (start + len - 1) >> PAGE_SHIFT;
- unsigned long start_index = start >> PAGE_SHIFT;
- unsigned long first_index = folios[0]->index;
int ret = 0;
- int i;
-
- ASSERT(last_index - first_index + 1 <= nr_pages);
ret = btrfs_delalloc_reserve_space(inode, &data_reserved, start, len);
if (ret < 0)
@@ -1179,10 +1179,17 @@ static int defrag_one_locked_target(struct btrfs_inode *inode,
set_extent_bit(&inode->io_tree, start, start + len - 1,
EXTENT_DELALLOC | EXTENT_DEFRAG, cached_state);
- /* Update the page status */
- for (i = start_index - first_index; i <= last_index - first_index; i++) {
- folio_clear_checked(folios[i]);
- btrfs_folio_clamp_set_dirty(fs_info, folios[i], start, len);
+ /*
+ * Update the page status.
+ * Due to possible larger folios, we have to check all folios one by one.
+ * And the btrfs_folio_clamp_*() helpers can handle ranges out of the
+ * folio cases well.
+ */
+ for (int i = 0; i < nr_pages && folios[i]; i++) {
+ struct folio *folio = folios[i];
+
+ btrfs_folio_clamp_clear_checked(fs_info, folio, start, len);
+ btrfs_folio_clamp_set_dirty(fs_info, folio, start, len);
}
btrfs_delalloc_release_extents(inode, len);
extent_changeset_free(data_reserved);
@@ -1200,9 +1207,9 @@ static int defrag_one_range(struct btrfs_inode *inode, u64 start, u32 len,
LIST_HEAD(target_list);
struct folio **folios;
const u32 sectorsize = inode->root->fs_info->sectorsize;
- u64 last_index = (start + len - 1) >> PAGE_SHIFT;
- u64 start_index = start >> PAGE_SHIFT;
- unsigned int nr_pages = last_index - start_index + 1;
+ u64 cur = start;
+ const unsigned int nr_pages = ((start + len - 1) >> PAGE_SHIFT) -
+ (start >> PAGE_SHIFT) + 1;
int ret = 0;
int i;
@@ -1214,21 +1221,25 @@ static int defrag_one_range(struct btrfs_inode *inode, u64 start, u32 len,
return -ENOMEM;
/* Prepare all pages */
- for (i = 0; i < nr_pages; i++) {
- folios[i] = defrag_prepare_one_folio(inode, start_index + i);
+ for (i = 0; cur < start + len && i < nr_pages; i++) {
+ folios[i] = defrag_prepare_one_folio(inode, cur >> PAGE_SHIFT);
if (IS_ERR(folios[i])) {
ret = PTR_ERR(folios[i]);
- nr_pages = i;
+ folios[i] = NULL;
goto free_folios;
}
+ cur = folio_pos(folios[i]) + folio_size(folios[i]);
}
- for (i = 0; i < nr_pages; i++)
+ for (i = 0; i < nr_pages; i++) {
+ if (!folios[i])
+ break;
folio_wait_writeback(folios[i]);
+ }
- /* Lock the pages range */
- lock_extent(&inode->io_tree, start_index << PAGE_SHIFT,
- (last_index << PAGE_SHIFT) + PAGE_SIZE - 1,
+ /* Lock the folios[] range */
+ lock_extent(&inode->io_tree, folio_pos(folios[0]), cur - 1,
&cached_state);
+
/*
* Now we have a consistent view about the extent map, re-check
* which range really needs to be defragged.
@@ -1254,11 +1265,12 @@ static int defrag_one_range(struct btrfs_inode *inode, u64 start, u32 len,
kfree(entry);
}
unlock_extent:
- unlock_extent(&inode->io_tree, start_index << PAGE_SHIFT,
- (last_index << PAGE_SHIFT) + PAGE_SIZE - 1,
+ unlock_extent(&inode->io_tree, folio_pos(folios[0]), cur - 1,
&cached_state);
free_folios:
for (i = 0; i < nr_pages; i++) {
+ if (!folios[i])
+ break;
folio_unlock(folios[i]);
folio_put(folios[i]);
}
--
2.48.1
^ permalink raw reply related [flat|nested] 19+ messages in thread

* Re: [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large()
2025-03-17 7:10 [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() Qu Wenruo
` (8 preceding siblings ...)
2025-03-17 7:10 ` [PATCH v3 9/9] btrfs: enable larger data folios support for defrag Qu Wenruo
@ 2025-03-17 13:55 ` David Sterba
2025-03-17 15:13 ` David Sterba
9 siblings, 1 reply; 19+ messages in thread
From: David Sterba @ 2025-03-17 13:55 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Mon, Mar 17, 2025 at 05:40:45PM +1030, Qu Wenruo wrote:
> [CHANGELOG]
> v3:
> - Prepare btrfs_end_repair_bio() to support larger folios
> Unfortunately folio_iter structure is way too large compared to
> bvec_iter, thus it's not a good idea to convert bbio::saved_iter to
> folio_iter.
>
> Thankfully it's not that complex to grab the folio from a bio_vec.
>
> - Add a new patch to prepare defrag for larger data folios
I was not expecting v3 as the series was in for-next so I did some edits
to changelogs, namely changing 'larger folios' -> 'large folios' as this
is how it's called in MM. Although technically the folios are larger I'd
like to keep using the same terminology.
There are new patches so feel free to replace the whole series, I'm
going to do a pass over the whole branch anyway so will fix anything
that's left. Right now it's the last chance to get the patches into 6.15 so
I don't want to delay it on your side.
* Re: [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large()
2025-03-17 13:55 ` [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large() David Sterba
@ 2025-03-17 15:13 ` David Sterba
2025-03-31 4:47 ` Qu Wenruo
0 siblings, 1 reply; 19+ messages in thread
From: David Sterba @ 2025-03-17 15:13 UTC (permalink / raw)
To: David Sterba; +Cc: Qu Wenruo, linux-btrfs
On Mon, Mar 17, 2025 at 02:55:02PM +0100, David Sterba wrote:
> On Mon, Mar 17, 2025 at 05:40:45PM +1030, Qu Wenruo wrote:
> > [CHANGELOG]
> > v3:
> > - Prepare btrfs_end_repair_bio() to support larger folios
> > Unfortunately folio_iter structure is way too large compared to
> > bvec_iter, thus it's not a good idea to convert bbio::saved_iter to
> > folio_iter.
> >
> > Thankfully it's not that complex to grab the folio from a bio_vec.
> >
> > - Add a new patch to prepare defrag for larger data folios
>
> I was not expecting v3 as the series was in for-next so I did some edits
[...]
Scratch that, this is not the same series.
* Re: [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large()
2025-03-17 15:13 ` David Sterba
@ 2025-03-31 4:47 ` Qu Wenruo
2025-04-03 19:54 ` David Sterba
0 siblings, 1 reply; 19+ messages in thread
From: Qu Wenruo @ 2025-03-31 4:47 UTC (permalink / raw)
To: dsterba; +Cc: Qu Wenruo, linux-btrfs
On 2025/3/18 01:43, David Sterba wrote:
> On Mon, Mar 17, 2025 at 02:55:02PM +0100, David Sterba wrote:
>> On Mon, Mar 17, 2025 at 05:40:45PM +1030, Qu Wenruo wrote:
>>> [CHANGELOG]
>>> v3:
>>> - Prepare btrfs_end_repair_bio() to support larger folios
>>> Unfortunately folio_iter structure is way too large compared to
>>> bvec_iter, thus it's not a good idea to convert bbio::saved_iter to
>>> folio_iter.
>>>
>>> Thankfully it's not that complex to grab the folio from a bio_vec.
>>>
>>> - Add a new patch to prepare defrag for larger data folios
>>
>> I was not expecting v3 as the series was in for-next so I did some edits
> [...]
>
> Scratch that, this is not the same series.
>
BTW, are there any extra comments that need to be addressed? (E.g. should
I merge them all into a single patch?)
I noticed several small comment conflicts ("larger folio" -> "large
folio"), but they are very easy to resolve (local branch updated).
There is a newer series that is already mostly reviewed:
https://lore.kernel.org/linux-btrfs/cover.1743239672.git.wqu@suse.com/
That relies on this series, and I'm already doing some basic large folio
tests (fsstress runs) now.
So I'm wondering whether I can push the series now, or should I continue
waiting for extra reviews?
Thanks,
Qu
* Re: [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large()
2025-03-31 4:47 ` Qu Wenruo
@ 2025-04-03 19:54 ` David Sterba
2025-04-03 21:45 ` Qu Wenruo
0 siblings, 1 reply; 19+ messages in thread
From: David Sterba @ 2025-04-03 19:54 UTC (permalink / raw)
To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, linux-btrfs
On Mon, Mar 31, 2025 at 03:17:35PM +1030, Qu Wenruo wrote:
>
>
> On 2025/3/18 01:43, David Sterba wrote:
> > On Mon, Mar 17, 2025 at 02:55:02PM +0100, David Sterba wrote:
> >> On Mon, Mar 17, 2025 at 05:40:45PM +1030, Qu Wenruo wrote:
> >>> [CHANGELOG]
> >>> v3:
> >>> - Prepare btrfs_end_repair_bio() to support larger folios
> >>> Unfortunately folio_iter structure is way too large compared to
> >>> bvec_iter, thus it's not a good idea to convert bbio::saved_iter to
> >>> folio_iter.
> >>>
> >>> Thankfully it's not that complex to grab the folio from a bio_vec.
> >>>
> >>> - Add a new patch to prepare defrag for larger data folios
> >>
> >> I was not expecting v3 as the series was in for-next so I did some edits
> > [...]
> >
> > Scratch that, this is not the same series.
> >
>
> BTW, any extra commends needs to be addressed? (E.g. should I merge them
> all into a single patch?)
I think the patch separation is good, please leave it as it is.
> I notice several small comment conflicts ("larger folio -> large
> folio"), but is very easy to resolve (local branch updated).
>
> There is a newer series that is already mostly reviewed:
>
> https://lore.kernel.org/linux-btrfs/cover.1743239672.git.wqu@suse.com/
>
> That relies on this series, and I'm already doing some basic (fsstress
> runs) large folio tests now.
>
> So I'm wondering can I push the series now, or should I continue waiting
> for extra reviews?
Please add it to for-next. I did only a light review, we'll find more
things during testing.
* Re: [PATCH v3 0/9] btrfs: remove ASSERT()s for folio_order() and folio_test_large()
2025-04-03 19:54 ` David Sterba
@ 2025-04-03 21:45 ` Qu Wenruo
0 siblings, 0 replies; 19+ messages in thread
From: Qu Wenruo @ 2025-04-03 21:45 UTC (permalink / raw)
To: dsterba, Qu Wenruo; +Cc: linux-btrfs
On 2025/4/4 06:24, David Sterba wrote:
> On Mon, Mar 31, 2025 at 03:17:35PM +1030, Qu Wenruo wrote:
>>
>>
>> On 2025/3/18 01:43, David Sterba wrote:
>>> On Mon, Mar 17, 2025 at 02:55:02PM +0100, David Sterba wrote:
>>>> On Mon, Mar 17, 2025 at 05:40:45PM +1030, Qu Wenruo wrote:
>>>>> [CHANGELOG]
>>>>> v3:
>>>>> - Prepare btrfs_end_repair_bio() to support larger folios
>>>>> Unfortunately folio_iter structure is way too large compared to
>>>>> bvec_iter, thus it's not a good idea to convert bbio::saved_iter to
>>>>> folio_iter.
>>>>>
>>>>> Thankfully it's not that complex to grab the folio from a bio_vec.
>>>>>
>>>>> - Add a new patch to prepare defrag for larger data folios
>>>>
>>>> I was not expecting v3 as the series was in for-next so I did some edits
>>> [...]
>>>
>>> Scratch that, this is not the same series.
>>>
>>
>> BTW, are there any extra comments that need to be addressed? (E.g. should
>> I merge them all into a single patch?)
>
> I think the patch separation is good, please leave it as it is.
>
>> I noticed several small comment conflicts ("larger folio" -> "large
>> folio"), but they are very easy to resolve (local branch updated).
>>
>> There is a newer series that is already mostly reviewed:
>>
>> https://lore.kernel.org/linux-btrfs/cover.1743239672.git.wqu@suse.com/
>>
>> That relies on this series, and I'm already doing some basic large folio
>> tests (fsstress runs) now.
>>
>> So I'm wondering whether I can push the series now, or should I continue
>> waiting for extra reviews?
>
> Please add it to for-next. I did only a light review, we'll find more
> things during testing.
Thanks, but it looks like the read repair and defrag part (the last two
patches) should be delayed a little.
I'll push the first 7 safe ASSERT() removals into for-next first.
Now that I have finally fixed the last bug exposed by the fsstress runs, I
can continue with fstests runs.
Instead of pushing the untested defrag and read-repair patches, I can
properly test them and fix the bugs (the read-repair one seems to cause
bugs during my local tests).
Thanks,
Qu