public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] btrfs: fix corner cases for subpage generic/363 failures
@ 2025-04-11  5:13 Qu Wenruo
  2025-04-11  5:14 ` [PATCH 1/2] btrfs: make btrfs_truncate_block() to zero involved blocks in a folio Qu Wenruo
  2025-04-11  5:14 ` [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases Qu Wenruo
  0 siblings, 2 replies; 15+ messages in thread
From: Qu Wenruo @ 2025-04-11  5:13 UTC (permalink / raw)
  To: linux-btrfs

Test case generic/363 always fails on subpage (fs block size < page
size) btrfs; there are mostly two kinds of problems here:

All examples are based on 64K page size and 4K fs block size.

1) EOF is polluted and btrfs_truncate_block() only zeros the block that
   needs to be written back

   
   0                           32K                           64K
   |                           |              |GGGGGGGGGGGGGG|
                                              50K EOF
   The original file is 50K in size (not 4K aligned), and fsx polluted
   the range beyond EOF through a memory mapped write.
   And since memory mapped writes are page based, and our page size is
   larger than the block size, the page range [0, 64K) covers blocks
   beyond EOF.

   Those polluted ranges will not be written back, but will still affect
   our page cache.

   Then some operation happens to expand the inode to size 64K.

   In that case btrfs_truncate_block() is called to trim the block
   [48K, 52K), and that block will be marked dirty for writeback.

   But the range [52K, 64K) is not touched at all, leaving the garbage
   hanging there and triggering `fsx -e 1` failures.

   Fix this case by forcing btrfs_truncate_block() to zero all involved
   blocks.  (Meanwhile still only the one block [48K, 52K) will be
   written back.)
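
The case above can be modeled with a few lines of Python (a hypothetical
sketch, not kernel code; the constants and names are illustrative only):

```python
# Case 1 model: 64K page, 4K fs blocks, file expanded from 50K to 64K.
PAGE_SIZE = 64 * 1024
BLOCK_SIZE = 4 * 1024

def round_down(value, align):
    return value - (value % align)

old_eof = 50 * 1024                               # 50K, not 4K aligned

# Old behavior: zero only the tail of the single block containing EOF.
block_start = round_down(old_eof, BLOCK_SIZE)     # 48K
old_zeroed = (old_eof, block_start + BLOCK_SIZE)  # [50K, 52K)

# Fixed behavior: zero everything from EOF to the end of the folio, so
# mmap garbage in [52K, 64K) cannot survive in the page cache.
new_zeroed = (old_eof, PAGE_SIZE)                 # [50K, 64K)
```

Note only the block [48K, 52K) is still marked dirty; the extra zeroing
beyond it affects the page cache only.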

2) EOF is polluted and the original size is block aligned so
   btrfs_truncate_block() does nothing

   0                           32K                           64K
   |                           |                |GGGGGGGGGGGG|
                                                52K EOF

   Mostly the same as case 1, but this time since the inode size is
   block aligned, btrfs_truncate_block() will do nothing.

   This leaves the garbage range [52K, 64K) untouched and fails
   `fsx -e 1` runs.

   Fix this case by forcing btrfs_truncate_block() to zero all involved
   blocks when the fs block size is smaller than the page size and the
   range is block aligned.
   This will not cause any new dirty blocks, but purely zeroes out the
   range beyond EOF to pass `fsx -e 1` runs.
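
Why case 2 slips through the old code can be shown with a tiny predicate
sketch (hypothetical Python, not kernel code):

```python
# Case 2 model: EOF at 52K is 4K aligned, so the old early exit fires
# even though the 64K page still caches mmap garbage beyond EOF.
PAGE_SIZE = 64 * 1024
BLOCK_SIZE = 4 * 1024
eof = 52 * 1024  # block aligned, from the diagram above

def old_truncate_block_is_noop(offset, blocksize):
    # Mirrors the old IS_ALIGNED(offset, blocksize) early exit.
    return offset % blocksize == 0

def must_zero_eof_folio(blocksize, pagesize):
    # The fix: with subpage blocks, still zero the EOF folio range in
    # the page cache, creating no new dirty blocks.
    return blocksize < pagesize
```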

Qu Wenruo (2):
  btrfs: make btrfs_truncate_block() to zero involved blocks in a folio
  btrfs: make btrfs_truncate_block() zero folio range for certain
    subpage corner cases

 fs/btrfs/btrfs_inode.h |  10 ++-
 fs/btrfs/file.c        |  33 ++++++---
 fs/btrfs/inode.c       | 148 ++++++++++++++++++++++++++++++++++-------
 3 files changed, 155 insertions(+), 36 deletions(-)

-- 
2.49.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] btrfs: make btrfs_truncate_block() to zero involved blocks in a folio
  2025-04-11  5:13 [PATCH 0/2] btrfs: fix corner cases for subpage generic/363 failures Qu Wenruo
@ 2025-04-11  5:14 ` Qu Wenruo
  2025-04-11  7:02   ` Qu Wenruo
  2025-04-11  5:14 ` [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases Qu Wenruo
  1 sibling, 1 reply; 15+ messages in thread
From: Qu Wenruo @ 2025-04-11  5:14 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
The following fsx sequence will fail on btrfs with 64K page size and 4K
fs block size:

 #fsx -d -e 1 -N 4 $mnt/junk -S 36386
 READ BAD DATA: offset = 0xe9ba, size = 0x6dd5, fname = /mnt/btrfs/junk
 OFFSET      GOOD    BAD     RANGE
 0xe9ba      0x0000  0x03ac  0x0
 operation# (mod 256) for the bad data may be 3
 ...
 LOG DUMP (4 total operations):
 1(  1 mod 256): WRITE    0x6c62 thru 0x1147d	(0xa81c bytes) HOLE	***WWWW
 2(  2 mod 256): TRUNCATE DOWN	from 0x1147e to 0x5448	******WWWW
 3(  3 mod 256): ZERO     0x1c7aa thru 0x28fe2	(0xc839 bytes)
 4(  4 mod 256): MAPREAD  0xe9ba thru 0x1578e	(0x6dd5 bytes)	***RRRR***

[CAUSE]
Only 2 operations are really involved in this case:

 3 pollute_eof	0x5448 thru	0xffff	(0xabb8 bytes)
 3 zero	from 0x1c7aa to 0x28fe3, (0xc839 bytes)
 4 mapread	0xe9ba thru	0x1578e	(0x6dd5 bytes)

At operation 3, fsx pollutes data beyond EOF; that is done by mmap()ing
the file and writing into that mmap()ed range beyond EOF.

Such a write fills the range beyond EOF, but it will never reach disk,
as ranges beyond EOF will not be marked dirty nor uptodate.

Then we zero_range for [0x1c7aa, 0x28fe3], and since the range is beyond
our isize (which was 0x5448), we should zero out any range beyond
EOF (0x5448).

During btrfs_zero_range(), we call btrfs_truncate_block() to dirty the
unaligned head block.
But that function only really zeroes out the block at [0x5000, 0x5fff];
it doesn't bother with any range other than that block, since those
ranges will not be marked dirty nor written back.

So the range [0x6000, 0xffff] is still polluted, and later mapread()
will return the poisoned value.

Such behavior is only exposed when the page size is larger than the fs
block size, as for the block size == page size case the block is exactly
one page, and fsx only checks exactly one page at EOF.
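
For concreteness, the offsets above can be checked with a short sketch
(plain Python, not kernel code):

```python
# isize is 0x5448; the old btrfs_truncate_block() zeroes only the EOF
# block [0x5000, 0x5fff], so [0x6000, 0xffff] of the 64K page keeps the
# mmap garbage that mapread() later hits.
BLOCK = 0x1000
PAGE = 0x10000
isize = 0x5448

eof_block_start = isize - (isize % BLOCK)     # 0x5000
eof_block_end = eof_block_start + BLOCK - 1   # 0x5fff

old_zeroed = range(isize, eof_block_end + 1)  # [0x5448, 0x5fff]
polluted = range(isize, PAGE)                 # [0x5448, 0xffff]
leftover = range(eof_block_end + 1, PAGE)     # [0x6000, 0xffff]
```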

[FIX]
Enhance btrfs_truncate_block() by:

- Force callers to pass a @start/@end combination
  So that there will be no 0 length passed in.

- Replace the @front parameter with an enum
  And make it match the @start/@end parameters better by using
  BTRFS_TRUNCATE_HEAD_BLOCK and BTRFS_TRUNCATE_TAIL_BLOCK instead.

- Pass the original unmodified range into btrfs_truncate_block()
  There are several call sites inside btrfs_zero_range() and
  btrfs_punch_hole() where we pass part of the original range for
  truncating.

  This hides the original range, which can lead to under- or
  over-truncating.
  Thus we have to pass the original zero/punch range.

- Make btrfs_truncate_block() zero all involved blocks inside the folio
  Since we have the original range, we know exactly which range inside
  the folio that should be zeroed.

  It may cover blocks other than the one with data space reserved, but
  that's fine; the zeroed range will not be written back anyway.
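
The clamping this enhancement introduces can be sketched as follows (a
Python stand-in for the clamp_start/clamp_end computation in the diff
below; the helper name folio_zero_span is made up for illustration):

```python
def folio_zero_span(folio_pos, folio_size, orig_start, orig_end):
    """Offset and length inside the folio to zero, where
    [orig_start, orig_end] is the original inclusive zero/punch range."""
    clamp_start = max(folio_pos, orig_start)
    clamp_end = min(folio_pos + folio_size - 1, orig_end)
    return clamp_start - folio_pos, clamp_end - clamp_start + 1

# 64K folio at 0, original zero range [0x5448, 0x28fe2]: the whole
# polluted tail [0x5448, 0xffff] (0xabb8 bytes) gets zeroed.
offset, length = folio_zero_span(0x0, 0x10000, 0x5448, 0x28fe2)
```

The 0xabb8-byte result matches the "pollute_eof 0x5448 thru 0xffff"
entry in the fsx log above.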

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/btrfs_inode.h | 10 +++++--
 fs/btrfs/file.c        | 33 ++++++++++++++-------
 fs/btrfs/inode.c       | 65 +++++++++++++++++++++++++++---------------
 3 files changed, 73 insertions(+), 35 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 4e2952cf5766..21b005ddf42c 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -547,8 +547,14 @@ int btrfs_add_link(struct btrfs_trans_handle *trans,
 		   struct btrfs_inode *parent_inode, struct btrfs_inode *inode,
 		   const struct fscrypt_str *name, int add_backref, u64 index);
 int btrfs_delete_subvolume(struct btrfs_inode *dir, struct dentry *dentry);
-int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
-			 int front);
+
+enum btrfs_truncate_where {
+	BTRFS_TRUNCATE_HEAD_BLOCK,
+	BTRFS_TRUNCATE_TAIL_BLOCK,
+};
+int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t end,
+			 u64 orig_start, u64 orig_end,
+			 enum btrfs_truncate_where where);
 
 int btrfs_start_delalloc_snapshot(struct btrfs_root *root, bool in_reclaim_context);
 int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, long nr,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index e3fea1db4304..55fa91799fb6 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2616,7 +2616,8 @@ static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len)
 	u64 lockend;
 	u64 tail_start;
 	u64 tail_len;
-	u64 orig_start = offset;
+	const u64 orig_start = offset;
+	const u64 orig_end = offset + len - 1;
 	int ret = 0;
 	bool same_block;
 	u64 ino_size;
@@ -2659,7 +2660,8 @@ static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len)
 		if (offset < ino_size) {
 			truncated_block = true;
 			ret = btrfs_truncate_block(BTRFS_I(inode), offset, len,
-						   0);
+						   orig_start, orig_end,
+						   BTRFS_TRUNCATE_HEAD_BLOCK);
 		} else {
 			ret = 0;
 		}
@@ -2669,7 +2671,9 @@ static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len)
 	/* zero back part of the first block */
 	if (offset < ino_size) {
 		truncated_block = true;
-		ret = btrfs_truncate_block(BTRFS_I(inode), offset, 0, 0);
+		ret = btrfs_truncate_block(BTRFS_I(inode), offset, -1,
+					   orig_start, orig_end,
+					   BTRFS_TRUNCATE_HEAD_BLOCK);
 		if (ret) {
 			btrfs_inode_unlock(BTRFS_I(inode), BTRFS_ILOCK_MMAP);
 			return ret;
@@ -2706,8 +2710,9 @@ static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len)
 			if (tail_start + tail_len < ino_size) {
 				truncated_block = true;
 				ret = btrfs_truncate_block(BTRFS_I(inode),
-							tail_start + tail_len,
-							0, 1);
+						tail_start, tail_start + tail_len - 1,
+						orig_start, orig_end,
+						BTRFS_TRUNCATE_TAIL_BLOCK);
 				if (ret)
 					goto out_only_mutex;
 			}
@@ -2875,6 +2880,8 @@ static int btrfs_zero_range(struct inode *inode,
 	int ret;
 	u64 alloc_hint = 0;
 	const u64 sectorsize = fs_info->sectorsize;
+	const u64 orig_start = offset;
+	const u64 orig_end = offset + len - 1;
 	u64 alloc_start = round_down(offset, sectorsize);
 	u64 alloc_end = round_up(offset + len, sectorsize);
 	u64 bytes_to_reserve = 0;
@@ -2938,7 +2945,8 @@ static int btrfs_zero_range(struct inode *inode,
 		if (len < sectorsize && em->disk_bytenr != EXTENT_MAP_HOLE) {
 			free_extent_map(em);
 			ret = btrfs_truncate_block(BTRFS_I(inode), offset, len,
-						   0);
+						   orig_start, orig_end,
+						   BTRFS_TRUNCATE_HEAD_BLOCK);
 			if (!ret)
 				ret = btrfs_fallocate_update_isize(inode,
 								   offset + len,
@@ -2969,7 +2977,9 @@ static int btrfs_zero_range(struct inode *inode,
 			alloc_start = round_down(offset, sectorsize);
 			ret = 0;
 		} else if (ret == RANGE_BOUNDARY_WRITTEN_EXTENT) {
-			ret = btrfs_truncate_block(BTRFS_I(inode), offset, 0, 0);
+			ret = btrfs_truncate_block(BTRFS_I(inode), offset, -1,
+						   orig_start, orig_end,
+						   BTRFS_TRUNCATE_HEAD_BLOCK);
 			if (ret)
 				goto out;
 		} else {
@@ -2986,8 +2996,9 @@ static int btrfs_zero_range(struct inode *inode,
 			alloc_end = round_up(offset + len, sectorsize);
 			ret = 0;
 		} else if (ret == RANGE_BOUNDARY_WRITTEN_EXTENT) {
-			ret = btrfs_truncate_block(BTRFS_I(inode), offset + len,
-						   0, 1);
+			ret = btrfs_truncate_block(BTRFS_I(inode), offset, offset + len - 1,
+						   orig_start, orig_end,
+						   BTRFS_TRUNCATE_TAIL_BLOCK);
 			if (ret)
 				goto out;
 		} else {
@@ -3107,7 +3118,9 @@ static long btrfs_fallocate(struct file *file, int mode,
 		 * need to zero out the end of the block if i_size lands in the
 		 * middle of a block.
 		 */
-		ret = btrfs_truncate_block(BTRFS_I(inode), inode->i_size, 0, 0);
+		ret = btrfs_truncate_block(BTRFS_I(inode), inode->i_size, -1,
+					   inode->i_size, -1,
+					   BTRFS_TRUNCATE_HEAD_BLOCK);
 		if (ret)
 			goto out;
 	}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e283627c087d..0700a161b80e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4782,15 +4782,16 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
  *
  * @inode - inode that we're zeroing
  * @from - the offset to start zeroing
- * @len - the length to zero, 0 to zero the entire range respective to the
- *	offset
- * @front - zero up to the offset instead of from the offset on
+ * @end - the inclusive end to finish zeroing, can be -1 meaning truncating
+ *	  everything beyond @from.
+ * @where - Head or tail block to truncate.
  *
  * This will find the block for the "from" offset and cow the block and zero the
  * part we want to zero.  This is used with truncate and hole punching.
  */
-int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
-			 int front)
+int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t end,
+			 u64 orig_start, u64 orig_end,
+			 enum btrfs_truncate_where where)
 {
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct address_space *mapping = inode->vfs_inode.i_mapping;
@@ -4800,20 +4801,30 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
 	struct extent_changeset *data_reserved = NULL;
 	bool only_release_metadata = false;
 	u32 blocksize = fs_info->sectorsize;
-	pgoff_t index = from >> PAGE_SHIFT;
-	unsigned offset = from & (blocksize - 1);
+	pgoff_t index = (where == BTRFS_TRUNCATE_HEAD_BLOCK) ?
+			(from >> PAGE_SHIFT) : (end >> PAGE_SHIFT);
 	struct folio *folio;
 	gfp_t mask = btrfs_alloc_write_mask(mapping);
 	size_t write_bytes = blocksize;
 	int ret = 0;
 	u64 block_start;
 	u64 block_end;
+	u64 clamp_start;
+	u64 clamp_end;
 
-	if (IS_ALIGNED(offset, blocksize) &&
-	    (!len || IS_ALIGNED(len, blocksize)))
+	ASSERT(where == BTRFS_TRUNCATE_HEAD_BLOCK ||
+	       where == BTRFS_TRUNCATE_TAIL_BLOCK);
+
+	if (end == (loff_t)-1)
+		ASSERT(where == BTRFS_TRUNCATE_HEAD_BLOCK);
+
+	if (IS_ALIGNED(from, blocksize) && IS_ALIGNED(end + 1, blocksize))
 		goto out;
 
-	block_start = round_down(from, blocksize);
+	if (where == BTRFS_TRUNCATE_HEAD_BLOCK)
+		block_start = round_down(from, blocksize);
+	else
+		block_start = round_down(end, blocksize);
 	block_end = block_start + blocksize - 1;
 
 	ret = btrfs_check_data_free_space(inode, &data_reserved, block_start,
@@ -4893,17 +4904,22 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
 		goto out_unlock;
 	}
 
-	if (offset != blocksize) {
-		if (!len)
-			len = blocksize - offset;
-		if (front)
-			folio_zero_range(folio, block_start - folio_pos(folio),
-					 offset);
-		else
-			folio_zero_range(folio,
-					 (block_start - folio_pos(folio)) + offset,
-					 len);
-	}
+	/*
+	 * Although we have only reserved space for the one block, we still should
+	 * zero out all blocks in the original range.
+	 * The remaining blocks normally are already holes thus no need to zero again,
+	 * but it's possible for fs block size < page size cases to have memory mapped
+	 * writes to pollute ranges beyond EOF.
+	 *
+	 * In that case although the polluted blocks beyond EOF will not reach disk,
+	 * it still affects our page cache.
+	 */
+	clamp_start = max_t(u64, folio_pos(folio), orig_start);
+	clamp_end = min_t(u64, folio_pos(folio) + folio_size(folio) - 1,
+			  orig_end);
+	folio_zero_range(folio, clamp_start - folio_pos(folio),
+			 clamp_end - clamp_start + 1);
+
 	btrfs_folio_clear_checked(fs_info, folio, block_start,
 				  block_end + 1 - block_start);
 	btrfs_folio_set_dirty(fs_info, folio, block_start,
@@ -5005,7 +5021,8 @@ int btrfs_cont_expand(struct btrfs_inode *inode, loff_t oldsize, loff_t size)
 	 * rest of the block before we expand the i_size, otherwise we could
 	 * expose stale data.
 	 */
-	ret = btrfs_truncate_block(inode, oldsize, 0, 0);
+	ret = btrfs_truncate_block(inode, oldsize, -1, oldsize, -1,
+				   BTRFS_TRUNCATE_HEAD_BLOCK);
 	if (ret)
 		return ret;
 
@@ -7649,7 +7666,9 @@ static int btrfs_truncate(struct btrfs_inode *inode, bool skip_writeback)
 		btrfs_end_transaction(trans);
 		btrfs_btree_balance_dirty(fs_info);
 
-		ret = btrfs_truncate_block(inode, inode->vfs_inode.i_size, 0, 0);
+		ret = btrfs_truncate_block(inode, inode->vfs_inode.i_size, -1,
+					   inode->vfs_inode.i_size, -1,
+					   BTRFS_TRUNCATE_HEAD_BLOCK);
 		if (ret)
 			goto out;
 		trans = btrfs_start_transaction(root, 1);
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-11  5:13 [PATCH 0/2] btrfs: fix corner cases for subpage generic/363 failures Qu Wenruo
  2025-04-11  5:14 ` [PATCH 1/2] btrfs: make btrfs_truncate_block() to zero involved blocks in a folio Qu Wenruo
@ 2025-04-11  5:14 ` Qu Wenruo
  2025-04-12  5:12   ` kernel test robot
                     ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Qu Wenruo @ 2025-04-11  5:14 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
For the following fsx -e 1 run, btrfs still fails the run on 64K page
size with 4K fs block size:

READ BAD DATA: offset = 0x26b3a, size = 0xfafa, fname = /mnt/btrfs/junk
OFFSET      GOOD    BAD     RANGE
0x26b3a     0x0000  0x15b4  0x0
operation# (mod 256) for the bad data may be 21
[...]
LOG DUMP (28 total operations):
1(  1 mod 256): SKIPPED (no operation)
2(  2 mod 256): SKIPPED (no operation)
3(  3 mod 256): SKIPPED (no operation)
4(  4 mod 256): SKIPPED (no operation)
5(  5 mod 256): WRITE    0x1ea90 thru 0x285e0	(0x9b51 bytes) HOLE
6(  6 mod 256): ZERO     0x1b1a8 thru 0x20bd4	(0x5a2d bytes)
7(  7 mod 256): FALLOC   0x22b1a thru 0x272fa	(0x47e0 bytes) INTERIOR
8(  8 mod 256): WRITE    0x741d thru 0x13522	(0xc106 bytes)
9(  9 mod 256): MAPWRITE 0x73ee thru 0xdeeb	(0x6afe bytes)
10( 10 mod 256): FALLOC   0xb719 thru 0xb994	(0x27b bytes) INTERIOR
11( 11 mod 256): COPY 0x15ed8 thru 0x18be1	(0x2d0a bytes) to 0x25f6e thru 0x28c77
12( 12 mod 256): ZERO     0x1615e thru 0x1770e	(0x15b1 bytes)
13( 13 mod 256): SKIPPED (no operation)
14( 14 mod 256): DEDUPE 0x20000 thru 0x27fff	(0x8000 bytes) to 0x1000 thru 0x8fff
15( 15 mod 256): SKIPPED (no operation)
16( 16 mod 256): CLONE 0xa000 thru 0xffff	(0x6000 bytes) to 0x36000 thru 0x3bfff
17( 17 mod 256): ZERO     0x14adc thru 0x1b78a	(0x6caf bytes)
18( 18 mod 256): TRUNCATE DOWN	from 0x3c000 to 0x1e2e3	******WWWW
19( 19 mod 256): CLONE 0x4000 thru 0x11fff	(0xe000 bytes) to 0x16000 thru 0x23fff
20( 20 mod 256): FALLOC   0x311e1 thru 0x3681b	(0x563a bytes) PAST_EOF
21( 21 mod 256): FALLOC   0x351c5 thru 0x40000	(0xae3b bytes) EXTENDING
22( 22 mod 256): WRITE    0x920 thru 0x7e51	(0x7532 bytes)
23( 23 mod 256): COPY 0x2b58 thru 0xc508	(0x99b1 bytes) to 0x117b1 thru 0x1b161
24( 24 mod 256): TRUNCATE DOWN	from 0x40000 to 0x3c9a5
25( 25 mod 256): SKIPPED (no operation)
26( 26 mod 256): MAPWRITE 0x25020 thru 0x26b06	(0x1ae7 bytes)
27( 27 mod 256): SKIPPED (no operation)
28( 28 mod 256): READ     0x26b3a thru 0x36633	(0xfafa bytes)	***RRRR***

[CAUSE]
The involved operations are:

 fallocating to largest ever: 0x40000
 21 pollute_eof	0x24000 thru	0x2ffff	(0xc000 bytes)
 21 falloc	from 0x351c5 to 0x40000 (0xae3b bytes)
 28 read	0x26b3a thru	0x36633	(0xfafa bytes)

At operation #21 a pollute_eof is done by a memory mapped write into
the range [0x24000, 0x2ffff].
At this stage, the inode size is 0x24000, which is block aligned.

Then fallocate happens, and since it's expanding the inode, it will call
btrfs_truncate_block() to truncate any unaligned range.

But since the inode size is already block aligned,
btrfs_truncate_block() does nothing and exits.

However, remember that the folio at 0x20000 already has some polluted
ranges; although they will never be written back to disk, they still
affect the page cache, resulting in the later operation #28 reading out
the polluted values.

[FIX]
Instead of exiting early from btrfs_truncate_block() when the range is
already block aligned, do extra folio zeroing if the fs block size is
smaller than the page size.

This is to address exactly the above case where memory mapped writes
can still leave some garbage beyond EOF.
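
The resulting control flow on the aligned path can be sketched as
(hypothetical Python, not the actual C; the function name and return
strings are made up):

```python
def aligned_path_action(blocksize, pagesize, folio_cached):
    """What the new aligned-range path of btrfs_truncate_block() does."""
    if blocksize >= pagesize:
        # Block covers the whole page: an mmapped page cannot contain
        # anything besides the target block, so nothing can be polluted.
        return "nothing"
    if not folio_cached:
        # No folio in the page cache: no stale data to zero.
        return "nothing"
    # Subpage case with a cached folio: zero the clamped folio range,
    # without marking any new blocks dirty.
    return "zero folio range"
```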

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 82 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0700a161b80e..2dc9b565f1f1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4777,6 +4777,87 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
 	return ret;
 }
 
+/*
+ * A helper to zero out all blocks inside the range [@orig_start, @orig_end] of
+ * the target folio.
+ * The target folio is the one containing the head or tail block of the range
+ * [@from, @end].
+ *
+ * This is a special case for fs block size < page size, where even if the range
+ * [from, end] is already block aligned, we can still have blocks beyond EOF being
+ * polluted by memory mapped writes.
+ */
+static int zero_range_folio(struct btrfs_inode *inode, loff_t from, loff_t end,
+			    u64 orig_start, u64 orig_end,
+			    enum btrfs_truncate_where where)
+{
+	const u32 blocksize = inode->root->fs_info->sectorsize;
+	struct address_space *mapping = inode->vfs_inode.i_mapping;
+	struct extent_io_tree *io_tree = &inode->io_tree;
+	struct extent_state *cached_state = NULL;
+	struct btrfs_ordered_extent *ordered;
+	pgoff_t index = (where == BTRFS_TRUNCATE_HEAD_BLOCK) ?
+			(from >> PAGE_SHIFT) : (end >> PAGE_SHIFT);
+	struct folio *folio;
+	u64 block_start;
+	u64 block_end;
+	u64 clamp_start;
+	u64 clamp_end;
+	int ret = 0;
+
+	/*
+	 * The target head/tail block is already block aligned.
+	 * If the block size >= PAGE_SIZE, it's impossible for an mmapped
+	 * page to contain anything other than the target block.
+	 */
+	if (blocksize >= PAGE_SIZE)
+		return 0;
+again:
+	folio = filemap_lock_folio(mapping, index);
+	/* No folio present. */
+	if (IS_ERR(folio))
+		return 0;
+
+	if (!folio_test_uptodate(folio)) {
+		ret = btrfs_read_folio(NULL, folio);
+		folio_lock(folio);
+		if (folio->mapping != mapping) {
+			folio_unlock(folio);
+			folio_put(folio);
+			goto again;
+		}
+		if (!folio_test_uptodate(folio)) {
+			ret = -EIO;
+			goto out_unlock;
+		}
+	}
+	folio_wait_writeback(folio);
+
+	clamp_start = max_t(u64, folio_pos(folio), orig_start);
+	clamp_end = min_t(u64, folio_pos(folio) + folio_size(folio) - 1,
+			  orig_end);
+	block_start = round_down(clamp_start, blocksize);
+	block_end = round_up(clamp_end + 1, blocksize) - 1;
+	lock_extent(io_tree, block_start, block_end, &cached_state);
+	ordered = btrfs_lookup_ordered_range(inode, block_start, block_end + 1 - block_start);
+	if (ordered) {
+		unlock_extent(io_tree, block_start, block_end, &cached_state);
+		folio_unlock(folio);
+		folio_put(folio);
+		btrfs_start_ordered_extent(ordered);
+		btrfs_put_ordered_extent(ordered);
+		goto again;
+	}
+	folio_zero_range(folio, clamp_start - folio_pos(folio),
+			 clamp_end - clamp_start + 1);
+	unlock_extent(io_tree, block_start, block_end, &cached_state);
+
+out_unlock:
+	folio_unlock(folio);
+	folio_put(folio);
+	return ret;
+}
+
 /*
  * Read, zero a chunk and write a block.
  *
@@ -4819,7 +4900,7 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t end,
 		ASSERT(where == BTRFS_TRUNCATE_HEAD_BLOCK);
 
 	if (IS_ALIGNED(from, blocksize) && IS_ALIGNED(end + 1, blocksize))
-		goto out;
+		return zero_range_folio(inode, from, end, orig_start, orig_end, where);
 
 	if (where == BTRFS_TRUNCATE_HEAD_BLOCK)
 		block_start = round_down(from, blocksize);
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] btrfs: make btrfs_truncate_block() to zero involved blocks in a folio
  2025-04-11  5:14 ` [PATCH 1/2] btrfs: make btrfs_truncate_block() to zero involved blocks in a folio Qu Wenruo
@ 2025-04-11  7:02   ` Qu Wenruo
  0 siblings, 0 replies; 15+ messages in thread
From: Qu Wenruo @ 2025-04-11  7:02 UTC (permalink / raw)
  To: linux-btrfs



On 2025/4/11 14:44, Qu Wenruo wrote:
> [BUG]
> The following fsx sequence will fail on btrfs with 64K page size and 4K
> fs block size:
> 
>   #fsx -d -e 1 -N 4 $mnt/junk -S 36386
>   READ BAD DATA: offset = 0xe9ba, size = 0x6dd5, fname = /mnt/btrfs/junk
>   OFFSET      GOOD    BAD     RANGE
>   0xe9ba      0x0000  0x03ac  0x0
>   operation# (mod 256) for the bad data may be 3
>   ...
>   LOG DUMP (4 total operations):
>   1(  1 mod 256): WRITE    0x6c62 thru 0x1147d	(0xa81c bytes) HOLE	***WWWW
>   2(  2 mod 256): TRUNCATE DOWN	from 0x1147e to 0x5448	******WWWW
>   3(  3 mod 256): ZERO     0x1c7aa thru 0x28fe2	(0xc839 bytes)
>   4(  4 mod 256): MAPREAD  0xe9ba thru 0x1578e	(0x6dd5 bytes)	***RRRR***
> 
> [CAUSE]
> Only 2 operations are really involved in this case:
> 
>   3 pollute_eof	0x5448 thru	0xffff	(0xabb8 bytes)
>   3 zero	from 0x1c7aa to 0x28fe3, (0xc839 bytes)
>   4 mapread	0xe9ba thru	0x1578e	(0x6dd5 bytes)
> 
> At operation 3, fsx pollutes data beyond EOF; that is done by mmap()ing
> the file and writing into that mmap()ed range beyond EOF.
> 
> Such a write fills the range beyond EOF, but it will never reach disk,
> as ranges beyond EOF will not be marked dirty nor uptodate.
> 
> Then we zero_range for [0x1c7aa, 0x28fe3], and since the range is beyond
> our isize (which was 0x5448), we should zero out any range beyond
> EOF (0x5448).
> 
> During btrfs_zero_range(), we call btrfs_truncate_block() to dirty the
> unaligned head block.
> But that function only really zeroes out the block at [0x5000, 0x5fff];
> it doesn't bother with any range other than that block, since those
> ranges will not be marked dirty nor written back.
> 
> So the range [0x6000, 0xffff] is still polluted, and later mapread()
> will return the poisoned value.
> 
> Such behavior is only exposed when the page size is larger than the fs
> block size, as for the block size == page size case the block is exactly
> one page, and fsx only checks exactly one page at EOF.
> 
> [FIX]
> Enhance btrfs_truncate_block() by:
> 
> - Force callers to pass a @start/@end combination
>    So that there will be no 0 length passed in.
> 
> - Replace the @front parameter with an enum
>    And make it match the @start/@end parameters better by using
>    BTRFS_TRUNCATE_HEAD_BLOCK and BTRFS_TRUNCATE_TAIL_BLOCK instead.
> 
> - Pass the original unmodified range into btrfs_truncate_block()
>    There are several call sites inside btrfs_zero_range() and
>    btrfs_punch_hole() where we pass part of the original range for
>    truncating.
> 
>    This hides the original range, which can lead to under- or
>    over-truncating.
>    Thus we have to pass the original zero/punch range.
> 
> - Make btrfs_truncate_block() zero all involved blocks inside the folio
>    Since we have the original range, we know exactly which range inside
>    the folio that should be zeroed.
> 
>    It may cover blocks other than the one with data space reserved,
>    but that's fine; the zeroed range will not be written back anyway.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>   fs/btrfs/btrfs_inode.h | 10 +++++--
>   fs/btrfs/file.c        | 33 ++++++++++++++-------
>   fs/btrfs/inode.c       | 65 +++++++++++++++++++++++++++---------------
>   3 files changed, 73 insertions(+), 35 deletions(-)
> 
> diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
> index 4e2952cf5766..21b005ddf42c 100644
> --- a/fs/btrfs/btrfs_inode.h
> +++ b/fs/btrfs/btrfs_inode.h
> @@ -547,8 +547,14 @@ int btrfs_add_link(struct btrfs_trans_handle *trans,
>   		   struct btrfs_inode *parent_inode, struct btrfs_inode *inode,
>   		   const struct fscrypt_str *name, int add_backref, u64 index);
>   int btrfs_delete_subvolume(struct btrfs_inode *dir, struct dentry *dentry);
> -int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
> -			 int front);
> +
> +enum btrfs_truncate_where {
> +	BTRFS_TRUNCATE_HEAD_BLOCK,
> +	BTRFS_TRUNCATE_TAIL_BLOCK,
> +};
> +int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t end,
> +			 u64 orig_start, u64 orig_end,
> +			 enum btrfs_truncate_where where);
>   
>   int btrfs_start_delalloc_snapshot(struct btrfs_root *root, bool in_reclaim_context);
>   int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, long nr,
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index e3fea1db4304..55fa91799fb6 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -2616,7 +2616,8 @@ static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len)
>   	u64 lockend;
>   	u64 tail_start;
>   	u64 tail_len;
> -	u64 orig_start = offset;
> +	const u64 orig_start = offset;
> +	const u64 orig_end = offset + len - 1;
>   	int ret = 0;
>   	bool same_block;
>   	u64 ino_size;
> @@ -2659,7 +2660,8 @@ static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len)
>   		if (offset < ino_size) {
>   			truncated_block = true;
>   			ret = btrfs_truncate_block(BTRFS_I(inode), offset, len,

This is a bug, since the 3rd parameter is the inclusive end now, it 
should be "offset + len - 1".

This leads to generic/008 failure on x86_64.

> -						   0);
> +						   orig_start, orig_end,
> +						   BTRFS_TRUNCATE_HEAD_BLOCK);
>   		} else {
>   			ret = 0;
>   		}
> @@ -2669,7 +2671,9 @@ static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len)
>   	/* zero back part of the first block */
>   	if (offset < ino_size) {
>   		truncated_block = true;
> -		ret = btrfs_truncate_block(BTRFS_I(inode), offset, 0, 0);
> +		ret = btrfs_truncate_block(BTRFS_I(inode), offset, -1,
> +					   orig_start, orig_end,
> +					   BTRFS_TRUNCATE_HEAD_BLOCK);
>   		if (ret) {
>   			btrfs_inode_unlock(BTRFS_I(inode), BTRFS_ILOCK_MMAP);
>   			return ret;
> @@ -2706,8 +2710,9 @@ static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len)
>   			if (tail_start + tail_len < ino_size) {
>   				truncated_block = true;
>   				ret = btrfs_truncate_block(BTRFS_I(inode),
> -							tail_start + tail_len,
> -							0, 1);
> +						tail_start, tail_start + tail_len - 1,
> +						orig_start, orig_end,
> +						BTRFS_TRUNCATE_TAIL_BLOCK);
>   				if (ret)
>   					goto out_only_mutex;
>   			}
> @@ -2875,6 +2880,8 @@ static int btrfs_zero_range(struct inode *inode,
>   	int ret;
>   	u64 alloc_hint = 0;
>   	const u64 sectorsize = fs_info->sectorsize;
> +	const u64 orig_start = offset;
> +	const u64 orig_end = offset + len - 1;
>   	u64 alloc_start = round_down(offset, sectorsize);
>   	u64 alloc_end = round_up(offset + len, sectorsize);
>   	u64 bytes_to_reserve = 0;
> @@ -2938,7 +2945,8 @@ static int btrfs_zero_range(struct inode *inode,
>   		if (len < sectorsize && em->disk_bytenr != EXTENT_MAP_HOLE) {
>   			free_extent_map(em);
>   			ret = btrfs_truncate_block(BTRFS_I(inode), offset, len,

The same here.

The idea of changing @len to @end seems cursed, but there seems to be no 
better solution than this.

Will get them fixed in the next update.

Thanks,
Qu

> -						   0);
> +						   orig_start, orig_end,
> +						   BTRFS_TRUNCATE_HEAD_BLOCK);
>   			if (!ret)
>   				ret = btrfs_fallocate_update_isize(inode,
>   								   offset + len,
> @@ -2969,7 +2977,9 @@ static int btrfs_zero_range(struct inode *inode,
>   			alloc_start = round_down(offset, sectorsize);
>   			ret = 0;
>   		} else if (ret == RANGE_BOUNDARY_WRITTEN_EXTENT) {
> -			ret = btrfs_truncate_block(BTRFS_I(inode), offset, 0, 0);
> +			ret = btrfs_truncate_block(BTRFS_I(inode), offset, -1,
> +						   orig_start, orig_end,
> +						   BTRFS_TRUNCATE_HEAD_BLOCK);
>   			if (ret)
>   				goto out;
>   		} else {
> @@ -2986,8 +2996,9 @@ static int btrfs_zero_range(struct inode *inode,
>   			alloc_end = round_up(offset + len, sectorsize);
>   			ret = 0;
>   		} else if (ret == RANGE_BOUNDARY_WRITTEN_EXTENT) {
> -			ret = btrfs_truncate_block(BTRFS_I(inode), offset + len,
> -						   0, 1);
> +			ret = btrfs_truncate_block(BTRFS_I(inode), offset, offset + len - 1,
> +						   orig_start, orig_end,
> +						   BTRFS_TRUNCATE_TAIL_BLOCK);
>   			if (ret)
>   				goto out;
>   		} else {
> @@ -3107,7 +3118,9 @@ static long btrfs_fallocate(struct file *file, int mode,
>   		 * need to zero out the end of the block if i_size lands in the
>   		 * middle of a block.
>   		 */
> -		ret = btrfs_truncate_block(BTRFS_I(inode), inode->i_size, 0, 0);
> +		ret = btrfs_truncate_block(BTRFS_I(inode), inode->i_size, -1,
> +					   inode->i_size, -1,
> +					   BTRFS_TRUNCATE_HEAD_BLOCK);
>   		if (ret)
>   			goto out;
>   	}
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index e283627c087d..0700a161b80e 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -4782,15 +4782,16 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
>    *
>    * @inode - inode that we're zeroing
>    * @from - the offset to start zeroing
> - * @len - the length to zero, 0 to zero the entire range respective to the
> - *	offset
> - * @front - zero up to the offset instead of from the offset on
> + * @end - the inclusive end to finish zeroing, can be -1 meaning truncating
> + *	  everything beyond @from.
> + * @where - Head or tail block to truncate.
>    *
>    * This will find the block for the "from" offset and cow the block and zero the
>    * part we want to zero.  This is used with truncate and hole punching.
>    */
> -int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
> -			 int front)
> +int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t end,
> +			 u64 orig_start, u64 orig_end,
> +			 enum btrfs_truncate_where where)
>   {
>   	struct btrfs_fs_info *fs_info = inode->root->fs_info;
>   	struct address_space *mapping = inode->vfs_inode.i_mapping;
> @@ -4800,20 +4801,30 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
>   	struct extent_changeset *data_reserved = NULL;
>   	bool only_release_metadata = false;
>   	u32 blocksize = fs_info->sectorsize;
> -	pgoff_t index = from >> PAGE_SHIFT;
> -	unsigned offset = from & (blocksize - 1);
> +	pgoff_t index = (where == BTRFS_TRUNCATE_HEAD_BLOCK) ?
> +			(from >> PAGE_SHIFT) : (end >> PAGE_SHIFT);
>   	struct folio *folio;
>   	gfp_t mask = btrfs_alloc_write_mask(mapping);
>   	size_t write_bytes = blocksize;
>   	int ret = 0;
>   	u64 block_start;
>   	u64 block_end;
> +	u64 clamp_start;
> +	u64 clamp_end;
>   
> -	if (IS_ALIGNED(offset, blocksize) &&
> -	    (!len || IS_ALIGNED(len, blocksize)))
> +	ASSERT(where == BTRFS_TRUNCATE_HEAD_BLOCK ||
> +	       where == BTRFS_TRUNCATE_TAIL_BLOCK);
> +
> +	if (end == (loff_t)-1)
> +		ASSERT(where == BTRFS_TRUNCATE_HEAD_BLOCK);
> +
> +	if (IS_ALIGNED(from, blocksize) && IS_ALIGNED(end + 1, blocksize))
>   		goto out;
>   
> -	block_start = round_down(from, blocksize);
> +	if (where == BTRFS_TRUNCATE_HEAD_BLOCK)
> +		block_start = round_down(from, blocksize);
> +	else
> +		block_start = round_down(end, blocksize);
>   	block_end = block_start + blocksize - 1;
>   
>   	ret = btrfs_check_data_free_space(inode, &data_reserved, block_start,
> @@ -4893,17 +4904,22 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len,
>   		goto out_unlock;
>   	}
>   
> -	if (offset != blocksize) {
> -		if (!len)
> -			len = blocksize - offset;
> -		if (front)
> -			folio_zero_range(folio, block_start - folio_pos(folio),
> -					 offset);
> -		else
> -			folio_zero_range(folio,
> -					 (block_start - folio_pos(folio)) + offset,
> -					 len);
> -	}
> +	/*
> +	 * Although we have only reserved space for the one block, we still should
> +	 * zero out all blocks in the original range.
> +	 * The remaining blocks normally are already holes thus no need to zero again,
> +	 * but it's possible for fs block size < page size cases to have memory mapped
> +	 * writes to pollute ranges beyond EOF.
> +	 *
> +	 * In that case although the polluted blocks beyond EOF will not reach disk,
> +	 * it still affects our page cache.
> +	 */
> +	clamp_start = max_t(u64, folio_pos(folio), orig_start);
> +	clamp_end = min_t(u64, folio_pos(folio) + folio_size(folio) - 1,
> +			  orig_end);
> +	folio_zero_range(folio, clamp_start - folio_pos(folio),
> +			 clamp_end - clamp_start + 1);
> +
>   	btrfs_folio_clear_checked(fs_info, folio, block_start,
>   				  block_end + 1 - block_start);
>   	btrfs_folio_set_dirty(fs_info, folio, block_start,
> @@ -5005,7 +5021,8 @@ int btrfs_cont_expand(struct btrfs_inode *inode, loff_t oldsize, loff_t size)
>   	 * rest of the block before we expand the i_size, otherwise we could
>   	 * expose stale data.
>   	 */
> -	ret = btrfs_truncate_block(inode, oldsize, 0, 0);
> +	ret = btrfs_truncate_block(inode, oldsize, -1, oldsize, -1,
> +				   BTRFS_TRUNCATE_HEAD_BLOCK);
>   	if (ret)
>   		return ret;
>   
> @@ -7649,7 +7666,9 @@ static int btrfs_truncate(struct btrfs_inode *inode, bool skip_writeback)
>   		btrfs_end_transaction(trans);
>   		btrfs_btree_balance_dirty(fs_info);
>   
> -		ret = btrfs_truncate_block(inode, inode->vfs_inode.i_size, 0, 0);
> +		ret = btrfs_truncate_block(inode, inode->vfs_inode.i_size, -1,
> +					   inode->vfs_inode.i_size, -1,
> +					   BTRFS_TRUNCATE_HEAD_BLOCK);
>   		if (ret)
>   			goto out;
>   		trans = btrfs_start_transaction(root, 1);


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-11  5:14 ` [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases Qu Wenruo
@ 2025-04-12  5:12   ` kernel test robot
  2025-04-12  5:54   ` kernel test robot
  2025-04-12 18:35   ` Andy Shevchenko
  2 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-04-12  5:12 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: oe-kbuild-all

Hi Qu,

kernel test robot noticed the following build warnings:

[auto build test WARNING on kdave/for-next]
[also build test WARNING on linus/master v6.15-rc1 next-20250411]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-make-btrfs_truncate_block-to-zero-involved-blocks-in-a-folio/20250411-131525
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link:    https://lore.kernel.org/r/d66c922e591b3a57a230ca357b9085fe6ae53812.1744344865.git.wqu%40suse.com
patch subject: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
config: arm-randconfig-002-20250412 (https://download.01.org/0day-ci/archive/20250412/202504121224.JBBIEqsn-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 7.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250412/202504121224.JBBIEqsn-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202504121224.JBBIEqsn-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:27:0,
                    from include/linux/cpumask.h:11,
                    from include/linux/smp.h:13,
                    from include/linux/lockdep.h:14,
                    from include/linux/spinlock.h:63,
                    from include/linux/swait.h:7,
                    from include/linux/completion.h:12,
                    from include/linux/crypto.h:15,
                    from include/crypto/hash.h:12,
                    from fs/btrfs/inode.c:6:
   fs/btrfs/inode.c: In function 'zero_range_folio':
>> include/linux/math.h:15:29: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
    #define __round_mask(x, y) ((__typeof__(x))((y)-1))
                                ^
   include/linux/math.h:35:34: note: in expansion of macro '__round_mask'
    #define round_down(x, y) ((x) & ~__round_mask(x, y))
                                     ^~~~~~~~~~~~
   fs/btrfs/inode.c:4843:16: note: in expansion of macro 'round_down'
     block_start = round_down(clamp_start, block_size);
                   ^~~~~~~~~~
>> include/linux/math.h:15:29: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
    #define __round_mask(x, y) ((__typeof__(x))((y)-1))
                                ^
   include/linux/math.h:25:36: note: in expansion of macro '__round_mask'
    #define round_up(x, y) ((((x)-1) | __round_mask(x, y))+1)
                                       ^~~~~~~~~~~~
   fs/btrfs/inode.c:4844:14: note: in expansion of macro 'round_up'
     block_end = round_up(clamp_end + 1, block_size) - 1;
                 ^~~~~~~~


vim +15 include/linux/math.h

aa6159ab99a9ab Andy Shevchenko 2020-12-15   8  
aa6159ab99a9ab Andy Shevchenko 2020-12-15   9  /*
aa6159ab99a9ab Andy Shevchenko 2020-12-15  10   * This looks more complex than it should be. But we need to
aa6159ab99a9ab Andy Shevchenko 2020-12-15  11   * get the type for the ~ right in round_down (it needs to be
aa6159ab99a9ab Andy Shevchenko 2020-12-15  12   * as wide as the result!), and we want to evaluate the macro
aa6159ab99a9ab Andy Shevchenko 2020-12-15  13   * arguments just once each.
aa6159ab99a9ab Andy Shevchenko 2020-12-15  14   */
aa6159ab99a9ab Andy Shevchenko 2020-12-15 @15  #define __round_mask(x, y) ((__typeof__(x))((y)-1))
aa6159ab99a9ab Andy Shevchenko 2020-12-15  16  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-11  5:14 ` [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases Qu Wenruo
  2025-04-12  5:12   ` kernel test robot
@ 2025-04-12  5:54   ` kernel test robot
  2025-04-12 18:35   ` Andy Shevchenko
  2 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2025-04-12  5:54 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: oe-kbuild-all

Hi Qu,

kernel test robot noticed the following build warnings:

[auto build test WARNING on kdave/for-next]
[also build test WARNING on linus/master v6.15-rc1 next-20250411]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-make-btrfs_truncate_block-to-zero-involved-blocks-in-a-folio/20250411-131525
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link:    https://lore.kernel.org/r/d66c922e591b3a57a230ca357b9085fe6ae53812.1744344865.git.wqu%40suse.com
patch subject: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
config: x86_64-buildonly-randconfig-001-20250412 (https://download.01.org/0day-ci/archive/20250412/202504121352.3gzerLun-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250412/202504121352.3gzerLun-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202504121352.3gzerLun-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> vmlinux.o: warning: objtool: zero_range_folio+0x21f: relocation to !ENDBR: bd_may_claim+0x112

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-11  5:14 ` [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases Qu Wenruo
  2025-04-12  5:12   ` kernel test robot
  2025-04-12  5:54   ` kernel test robot
@ 2025-04-12 18:35   ` Andy Shevchenko
  2025-04-14  1:20     ` Qu Wenruo
  2 siblings, 1 reply; 15+ messages in thread
From: Andy Shevchenko @ 2025-04-12 18:35 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:
> [BUG]
> For the following fsx -e 1 run, the btrfs still fails the run on 64K
> page size with 4K fs block size:
> 
> READ BAD DATA: offset = 0x26b3a, size = 0xfafa, fname = /mnt/btrfs/junk
> OFFSET      GOOD    BAD     RANGE
> 0x26b3a     0x0000  0x15b4  0x0
> operation# (mod 256) for the bad data may be 21

[...]

> +static int zero_range_folio(struct btrfs_inode *inode, loff_t from, loff_t end,
> +			    u64 orig_start, u64 orig_end,
> +			    enum btrfs_truncate_where where)
> +{
> +	const u32 blocksize = inode->root->fs_info->sectorsize;
> +	struct address_space *mapping = inode->vfs_inode.i_mapping;
> +	struct extent_io_tree *io_tree = &inode->io_tree;
> +	struct extent_state *cached_state = NULL;
> +	struct btrfs_ordered_extent *ordered;
> +	pgoff_t index = (where == BTRFS_TRUNCATE_HEAD_BLOCK) ?
> +			(from >> PAGE_SHIFT) : (end >> PAGE_SHIFT);

You want to use PFN_*() macros from the pfn.h perhaps?

> +	struct folio *folio;
> +	u64 block_start;
> +	u64 block_end;
> +	u64 clamp_start;
> +	u64 clamp_end;
> +	int ret = 0;
> +
> +	/*
> +	 * The target head/tail block is already block aligned.
> +	 * If block size >= PAGE_SIZE, meaning it's impossible to mmap a
> +	 * page containing anything other than the target block.
> +	 */
> +	if (blocksize >= PAGE_SIZE)
> +		return 0;
> +again:
> +	folio = filemap_lock_folio(mapping, index);
> +	/* No folio present. */
> +	if (IS_ERR(folio))
> +		return 0;
> +
> +	if (!folio_test_uptodate(folio)) {
> +		ret = btrfs_read_folio(NULL, folio);
> +		folio_lock(folio);
> +		if (folio->mapping != mapping) {
> +			folio_unlock(folio);
> +			folio_put(folio);
> +			goto again;
> +		}
> +		if (!folio_test_uptodate(folio)) {
> +			ret = -EIO;
> +			goto out_unlock;
> +		}
> +	}
> +	folio_wait_writeback(folio);
> +
> +	clamp_start = max_t(u64, folio_pos(folio), orig_start);
> +	clamp_end = min_t(u64, folio_pos(folio) + folio_size(folio) - 1,
> +			  orig_end);

You probably wanted clamp() ?

> +	block_start = round_down(clamp_start, block_size);
> +	block_end = round_up(clamp_end + 1, block_size) - 1;

LKP rightfully complains, I believe you want to use ALIGN*() macros instead.

> +	lock_extent(io_tree, block_start, block_end, &cached_state);
> +	ordered = btrfs_lookup_ordered_range(inode, block_start, block_end + 1 - block_end);
> +	if (ordered) {
> +		unlock_extent(io_tree, block_start, block_end, &cached_state);
> +		folio_unlock(folio);
> +		folio_put(folio);
> +		btrfs_start_ordered_extent(ordered);
> +		btrfs_put_ordered_extent(ordered);
> +		goto again;
> +	}
> +	folio_zero_range(folio, clamp_start - folio_pos(folio),
> +			 clamp_end - clamp_start + 1);
> +	unlock_extent(io_tree, block_start, block_end, &cached_state);
> +
> +out_unlock:
> +	folio_unlock(folio);
> +	folio_put(folio);
> +	return ret;
> +}

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-12 18:35   ` Andy Shevchenko
@ 2025-04-14  1:20     ` Qu Wenruo
  2025-04-14  3:37       ` Qu Wenruo
  2025-04-14 10:40       ` Andy Shevchenko
  0 siblings, 2 replies; 15+ messages in thread
From: Qu Wenruo @ 2025-04-14  1:20 UTC (permalink / raw)
  To: Andy Shevchenko, Qu Wenruo; +Cc: linux-btrfs



在 2025/4/13 04:05, Andy Shevchenko 写道:
> Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:
>> [BUG]
>> For the following fsx -e 1 run, the btrfs still fails the run on 64K
>> page size with 4K fs block size:
>>
>> READ BAD DATA: offset = 0x26b3a, size = 0xfafa, fname = /mnt/btrfs/junk
>> OFFSET      GOOD    BAD     RANGE
>> 0x26b3a     0x0000  0x15b4  0x0
>> operation# (mod 256) for the bad data may be 21
>
> [...]
>
>> +static int zero_range_folio(struct btrfs_inode *inode, loff_t from, loff_t end,
>> +			    u64 orig_start, u64 orig_end,
>> +			    enum btrfs_truncate_where where)
>> +{
>> +	const u32 blocksize = inode->root->fs_info->sectorsize;
>> +	struct address_space *mapping = inode->vfs_inode.i_mapping;
>> +	struct extent_io_tree *io_tree = &inode->io_tree;
>> +	struct extent_state *cached_state = NULL;
>> +	struct btrfs_ordered_extent *ordered;
>> +	pgoff_t index = (where == BTRFS_TRUNCATE_HEAD_BLOCK) ?
>> +			(from >> PAGE_SHIFT) : (end >> PAGE_SHIFT);
>
> You want to use PFN_*() macros from the pfn.h perhaps?
>
>> +	struct folio *folio;
>> +	u64 block_start;
>> +	u64 block_end;
>> +	u64 clamp_start;
>> +	u64 clamp_end;
>> +	int ret = 0;
>> +
>> +	/*
>> +	 * The target head/tail block is already block aligned.
>> +	 * If block size >= PAGE_SIZE, meaning it's impossible to mmap a
>> +	 * page containing anything other than the target block.
>> +	 */
>> +	if (blocksize >= PAGE_SIZE)
>> +		return 0;
>> +again:
>> +	folio = filemap_lock_folio(mapping, index);
>> +	/* No folio present. */
>> +	if (IS_ERR(folio))
>> +		return 0;
>> +
>> +	if (!folio_test_uptodate(folio)) {
>> +		ret = btrfs_read_folio(NULL, folio);
>> +		folio_lock(folio);
>> +		if (folio->mapping != mapping) {
>> +			folio_unlock(folio);
>> +			folio_put(folio);
>> +			goto again;
>> +		}
>> +		if (!folio_test_uptodate(folio)) {
>> +			ret = -EIO;
>> +			goto out_unlock;
>> +		}
>> +	}
>> +	folio_wait_writeback(folio);
>> +
>> +	clamp_start = max_t(u64, folio_pos(folio), orig_start);
>> +	clamp_end = min_t(u64, folio_pos(folio) + folio_size(folio) - 1,
>> +			  orig_end);
>
> You probably wanted clamp() ?

Thanks a lot for the help!

It's way more readable than the open-coded one.

>
>> +	block_start = round_down(clamp_start, block_size);
>> +	block_end = round_up(clamp_end + 1, block_size) - 1;
>
> LKP rightfully complains, I believe you want to use ALIGN*() macros instead.

Personally speaking I really want to explicitly show whether it's
rounding up or down.

And unfortunately the ALIGN() itself doesn't show that (meanwhile the
ALIGN_DOWN() is pretty fine).

Can I just do a forced conversion on the @blocksize to fix the warning?

Thanks,
Qu

>
>> +	lock_extent(io_tree, block_start, block_end, &cached_state);
>> +	ordered = btrfs_lookup_ordered_range(inode, block_start, block_end + 1 - block_end);
>> +	if (ordered) {
>> +		unlock_extent(io_tree, block_start, block_end, &cached_state);
>> +		folio_unlock(folio);
>> +		folio_put(folio);
>> +		btrfs_start_ordered_extent(ordered);
>> +		btrfs_put_ordered_extent(ordered);
>> +		goto again;
>> +	}
>> +	folio_zero_range(folio, clamp_start - folio_pos(folio),
>> +			 clamp_end - clamp_start + 1);
>> +	unlock_extent(io_tree, block_start, block_end, &cached_state);
>> +
>> +out_unlock:
>> +	folio_unlock(folio);
>> +	folio_put(folio);
>> +	return ret;
>> +}
>


* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-14  1:20     ` Qu Wenruo
@ 2025-04-14  3:37       ` Qu Wenruo
  2025-04-14 10:40       ` Andy Shevchenko
  1 sibling, 0 replies; 15+ messages in thread
From: Qu Wenruo @ 2025-04-14  3:37 UTC (permalink / raw)
  To: Andy Shevchenko, Qu Wenruo; +Cc: linux-btrfs



在 2025/4/14 10:50, Qu Wenruo 写道:
>
>
> 在 2025/4/13 04:05, Andy Shevchenko 写道:
>> Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:
>>> [BUG]
>>> For the following fsx -e 1 run, the btrfs still fails the run on 64K
>>> page size with 4K fs block size:
>>>
>>> READ BAD DATA: offset = 0x26b3a, size = 0xfafa, fname = /mnt/btrfs/junk
>>> OFFSET      GOOD    BAD     RANGE
>>> 0x26b3a     0x0000  0x15b4  0x0
>>> operation# (mod 256) for the bad data may be 21
>>
>> [...]
>>
>>> +static int zero_range_folio(struct btrfs_inode *inode, loff_t from,
>>> loff_t end,
>>> +                u64 orig_start, u64 orig_end,
>>> +                enum btrfs_truncate_where where)
>>> +{
>>> +    const u32 blocksize = inode->root->fs_info->sectorsize;
>>> +    struct address_space *mapping = inode->vfs_inode.i_mapping;
>>> +    struct extent_io_tree *io_tree = &inode->io_tree;
>>> +    struct extent_state *cached_state = NULL;
>>> +    struct btrfs_ordered_extent *ordered;
>>> +    pgoff_t index = (where == BTRFS_TRUNCATE_HEAD_BLOCK) ?
>>> +            (from >> PAGE_SHIFT) : (end >> PAGE_SHIFT);
>>
>> You want to use PFN_*() macros from the pfn.h perhaps?
>>
>>> +    struct folio *folio;
>>> +    u64 block_start;
>>> +    u64 block_end;
>>> +    u64 clamp_start;
>>> +    u64 clamp_end;
>>> +    int ret = 0;
>>> +
>>> +    /*
>>> +     * The target head/tail block is already block aligned.
>>> +     * If block size >= PAGE_SIZE, meaning it's impossible to mmap a
>>> +     * page containing anything other than the target block.
>>> +     */
>>> +    if (blocksize >= PAGE_SIZE)
>>> +        return 0;
>>> +again:
>>> +    folio = filemap_lock_folio(mapping, index);
>>> +    /* No folio present. */
>>> +    if (IS_ERR(folio))
>>> +        return 0;
>>> +
>>> +    if (!folio_test_uptodate(folio)) {
>>> +        ret = btrfs_read_folio(NULL, folio);
>>> +        folio_lock(folio);
>>> +        if (folio->mapping != mapping) {
>>> +            folio_unlock(folio);
>>> +            folio_put(folio);
>>> +            goto again;
>>> +        }
>>> +        if (!folio_test_uptodate(folio)) {
>>> +            ret = -EIO;
>>> +            goto out_unlock;
>>> +        }
>>> +    }
>>> +    folio_wait_writeback(folio);
>>> +
>>> +    clamp_start = max_t(u64, folio_pos(folio), orig_start);
>>> +    clamp_end = min_t(u64, folio_pos(folio) + folio_size(folio) - 1,
>>> +              orig_end);
>>
>> You probably wanted clamp() ?
>
> Thanks a lot for the help!
>
> It's way more readable than the open-coded one.
>
>>
>>> +    block_start = round_down(clamp_start, block_size);
>>> +    block_end = round_up(clamp_end + 1, block_size) - 1;
>>
>> LKP rightfully complains, I believe you want to use ALIGN*() macros
>> instead.
>
> Personally speaking I really want to explicitly show whether it's
> rounding up or down.
>
> And unfortunately the ALIGN() itself doesn't show that (meanwhile the
> ALIGN_DOWN() is pretty fine).
>
> Can I just do a forced conversion on the @blocksize to fix the warning?

Never mind, it's a typo: it should be "blocksize", not "block_size"; the
latter is a different function defined in blkdev.h.

Thanks,
Qu

>
> Thanks,
> Qu
>
>>
>>> +    lock_extent(io_tree, block_start, block_end, &cached_state);
>>> +    ordered = btrfs_lookup_ordered_range(inode, block_start,
>>> block_end + 1 - block_end);
>>> +    if (ordered) {
>>> +        unlock_extent(io_tree, block_start, block_end, &cached_state);
>>> +        folio_unlock(folio);
>>> +        folio_put(folio);
>>> +        btrfs_start_ordered_extent(ordered);
>>> +        btrfs_put_ordered_extent(ordered);
>>> +        goto again;
>>> +    }
>>> +    folio_zero_range(folio, clamp_start - folio_pos(folio),
>>> +             clamp_end - clamp_start + 1);
>>> +    unlock_extent(io_tree, block_start, block_end, &cached_state);
>>> +
>>> +out_unlock:
>>> +    folio_unlock(folio);
>>> +    folio_put(folio);
>>> +    return ret;
>>> +}
>>
>



* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-14  1:20     ` Qu Wenruo
  2025-04-14  3:37       ` Qu Wenruo
@ 2025-04-14 10:40       ` Andy Shevchenko
  2025-04-15 18:18         ` David Sterba
  1 sibling, 1 reply; 15+ messages in thread
From: Andy Shevchenko @ 2025-04-14 10:40 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Qu Wenruo, linux-btrfs

On Mon, Apr 14, 2025 at 4:20 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> 在 2025/4/13 04:05, Andy Shevchenko 写道:
> > Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:

[...]

> >> +    block_start = round_down(clamp_start, block_size);
> >> +    block_end = round_up(clamp_end + 1, block_size) - 1;
> >
> > LKP rightfully complains, I believe you want to use ALIGN*() macros instead.
>
> Personally speaking I really want to explicitly show whether it's
> rounding up or down.
>
> And unfortunately the ALIGN() itself doesn't show that (meanwhile the
> ALIGN_DOWN() is pretty fine).
>
> Can I just do a forced conversion on the @blocksize to fix the warning?

ALIGN*() are for pointers, the round_*() are for integers. So, please
use ALIGN*().

-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-14 10:40       ` Andy Shevchenko
@ 2025-04-15 18:18         ` David Sterba
  2025-04-15 18:21           ` Andy Shevchenko
  0 siblings, 1 reply; 15+ messages in thread
From: David Sterba @ 2025-04-15 18:18 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: Qu Wenruo, Qu Wenruo, linux-btrfs

On Mon, Apr 14, 2025 at 01:40:11PM +0300, Andy Shevchenko wrote:
> On Mon, Apr 14, 2025 at 4:20 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> > 在 2025/4/13 04:05, Andy Shevchenko 写道:
> > > Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:
> 
> [...]
> 
> > >> +    block_start = round_down(clamp_start, block_size);
> > >> +    block_end = round_up(clamp_end + 1, block_size) - 1;
> > >
> > > LKP rightfully complains, I believe you want to use ALIGN*() macros instead.
> >
> > Personally speaking I really want to explicitly show whether it's
> > rounding up or down.
> >
> > And unfortunately the ALIGN() itself doesn't show that (meanwhile the
> > ALIGN_DOWN() is pretty fine).
> >
> > Can I just do a forced conversion on the @blocksize to fix the warning?
> 
> ALIGN*() are for pointers, the round_*() are for integers. So, please
> use ALIGN*().

clamp_start and blocksize are integers, and there's a lot of use of ALIGN
with integers too. There's no documentation saying it should be used only
for pointers; I can see PTR_ALIGN, which does the explicit cast to unsigned
long and then passes the result to ALIGN (as an integer).

Historically in the btrfs code the use of ALIGN and round_* is basically
50/50 so we don't have a consistent style, although we'd like to. As the
round_up and round_down are clear I'd rather keep using them in new
code.


* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-15 18:18         ` David Sterba
@ 2025-04-15 18:21           ` Andy Shevchenko
  2025-04-15 23:57             ` Qu Wenruo
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Shevchenko @ 2025-04-15 18:21 UTC (permalink / raw)
  To: dsterba; +Cc: Qu Wenruo, Qu Wenruo, linux-btrfs

On Tue, Apr 15, 2025 at 9:18 PM David Sterba <dsterba@suse.cz> wrote:
> On Mon, Apr 14, 2025 at 01:40:11PM +0300, Andy Shevchenko wrote:
> > On Mon, Apr 14, 2025 at 4:20 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> > > 在 2025/4/13 04:05, Andy Shevchenko 写道:
> > > > Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:

[...]

> > > >> +    block_start = round_down(clamp_start, block_size);
> > > >> +    block_end = round_up(clamp_end + 1, block_size) - 1;
> > > >
> > > > LKP rightfully complains, I believe you want to use ALIGN*() macros instead.
> > >
> > > Personally speaking I really want to explicitly show whether it's
> > > rounding up or down.
> > >
> > > And unfortunately the ALIGN() itself doesn't show that (meanwhile the
> > > ALIGN_DOWN() is pretty fine).
> > >
> > > Can I just do a forced conversion on the @blocksize to fix the warning?
> >
> > ALIGN*() are for pointers, the round_*() are for integers. So, please
> > use ALIGN*().
>
> clamp_start and blocksize are integers and there's a lot of use of ALIGN
> with integers too. There's no documentation saying it should be used for
> pointers, I can see PTR_ALIGN that does the explicit cast to unsigned
> logn and then passes it to ALIGN (as integer).

Yes, because unsigned long is the natural holder for addresses, and
since some APIs use it instead of pointers (for whatever reasons),
PTR_ALIGN() does that cast. But do you see the difference? round_*()
expects _the same_ type for both of its arguments, while ALIGN*() does
not. That is what makes the distinction.

> Historically in the btrfs code the use of ALIGN and round_* is basically
> 50/50 so we don't have a consistent style, although we'd like to. As the
> round_up and round_down are clear I'd rather keep using them in new
> code.

And how do you suggest avoiding the warning, please?

-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-15 18:21           ` Andy Shevchenko
@ 2025-04-15 23:57             ` Qu Wenruo
  2025-04-16  5:57               ` Andy Shevchenko
  0 siblings, 1 reply; 15+ messages in thread
From: Qu Wenruo @ 2025-04-15 23:57 UTC (permalink / raw)
  To: Andy Shevchenko, dsterba; +Cc: Qu Wenruo, linux-btrfs



在 2025/4/16 03:51, Andy Shevchenko 写道:
> On Tue, Apr 15, 2025 at 9:18 PM David Sterba <dsterba@suse.cz> wrote:
>> On Mon, Apr 14, 2025 at 01:40:11PM +0300, Andy Shevchenko wrote:
>>> On Mon, Apr 14, 2025 at 4:20 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>>> 在 2025/4/13 04:05, Andy Shevchenko 写道:
>>>>> Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:
> 
> [...]
> 
>>>>>> +    block_start = round_down(clamp_start, block_size);
>>>>>> +    block_end = round_up(clamp_end + 1, block_size) - 1;
>>>>>
>>>>> LKP rightfully complains, I believe you want to use ALIGN*() macros instead.
>>>>
>>>> Personally speaking I really want to explicitly show whether it's
>>>> rounding up or down.
>>>>
>>>> And unfortunately the ALIGN() itself doesn't show that (meanwhile the
>>>> ALIGN_DOWN() is pretty fine).
>>>>
>>>> Can I just do a forced conversion on the @blocksize to fix the warning?
>>>
>>> ALIGN*() are for pointers, the round_*() are for integers. So, please
>>> use ALIGN*().
>>
>> clamp_start and blocksize are integers and there's a lot of use of ALIGN
>> with integers too. There's no documentation saying it should be used for
>> pointers, I can see PTR_ALIGN that does the explicit cast to unsigned
>> logn and then passes it to ALIGN (as integer).
> 
> Yes, because the unsigned long is natural holder for the addresses and
> due to some APIs use it instead of pointers (for whatever reasons) the
> PTR_ALIGN() does that. But you see the difference? round_*() expect
> _the same_ types of the arguments, while ALIGN*() do not. That is what
> makes it so.
> 
>> Historically in the btrfs code the use of ALIGN and round_* is basically
>> 50/50 so we don't have a consistent style, although we'd like to. As the
>> round_up and round_down are clear I'd rather keep using them in new
>> code.
> 
> And how do you suggest avoiding the warning, please?

By fixing the typo, @block_size -> @blocksize.

The original warning is not about the type difference, but that 
@block_size is a function pointer.

We have tons of round_down()/round_up() usage inside btrfs, with 
different types.

E.g. btrfs_check_data_free_space() calls round_down()/round_up() with
u64 and u32 arguments, and did you get any warnings there?

Thanks,
Qu



* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-15 23:57             ` Qu Wenruo
@ 2025-04-16  5:57               ` Andy Shevchenko
  2025-04-16  8:28                 ` David Sterba
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Shevchenko @ 2025-04-16  5:57 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Qu Wenruo, linux-btrfs

On Wed, Apr 16, 2025 at 2:57 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> 在 2025/4/16 03:51, Andy Shevchenko 写道:
> > On Tue, Apr 15, 2025 at 9:18 PM David Sterba <dsterba@suse.cz> wrote:
> >> On Mon, Apr 14, 2025 at 01:40:11PM +0300, Andy Shevchenko wrote:
> >>> On Mon, Apr 14, 2025 at 4:20 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> >>>> 在 2025/4/13 04:05, Andy Shevchenko 写道:
> >>>>> Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:

[...]

> >>>>>> +    block_start = round_down(clamp_start, block_size);
> >>>>>> +    block_end = round_up(clamp_end + 1, block_size) - 1;
> >>>>>
> >>>>> LKP rightfully complains, I believe you want to use ALIGN*() macros instead.
> >>>>
> >>>> Personally speaking I really want to explicitly show whether it's
> >>>> rounding up or down.
> >>>>
> >>>> And unfortunately the ALIGN() itself doesn't show that (meanwhile the
> >>>> ALIGN_DOWN() is pretty fine).
> >>>>
> >>>> Can I just do a forced conversion on the @blocksize to fix the warning?
> >>>
> >>> ALIGN*() are for pointers, the round_*() are for integers. So, please
> >>> use ALIGN*().
> >>
> >> clamp_start and blocksize are integers and there's a lot of use of ALIGN
> >> with integers too. There's no documentation saying it should be used for
> >> pointers, I can see PTR_ALIGN that does the explicit cast to unsigned
> >> long and then passes it to ALIGN (as integer).
> >
> > Yes, because the unsigned long is natural holder for the addresses and
> > due to some APIs use it instead of pointers (for whatever reasons) the
> > PTR_ALIGN() does that. But you see the difference? round_*() expect
> > _the same_ types of the arguments, while ALIGN*() do not. That is what
> > makes it so.
> >
> >> Historically in the btrfs code the use of ALIGN and round_* is basically
> >> 50/50 so we don't have a consistent style, although we'd like to. As the
> >> round_up and round_down are clear I'd rather keep using them in new
> >> code.
> >
> > And how do you suggest avoiding the warning, please?
>
> By fixing the typo, @block_size -> @blocksize.

Ah, if it's that simple, then of course round_*() is fine to keep.
My only worry is about explicit casts being used to "fix" such a warning.

> The original warning is not about the type difference, but that
> @block_size is a function pointer.
>
> We have tons of round_down()/round_up() usage inside btrfs, with
> different types.
>
> E.g. btrfs_check_data_free_space() calls round_down()/round_up()
> against u64 and u32; did you get any warnings there?


-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases
  2025-04-16  5:57               ` Andy Shevchenko
@ 2025-04-16  8:28                 ` David Sterba
  0 siblings, 0 replies; 15+ messages in thread
From: David Sterba @ 2025-04-16  8:28 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: Qu Wenruo, Qu Wenruo, linux-btrfs

On Wed, Apr 16, 2025 at 08:57:40AM +0300, Andy Shevchenko wrote:
> On Wed, Apr 16, 2025 at 2:57 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> > 在 2025/4/16 03:51, Andy Shevchenko 写道:
> > > On Tue, Apr 15, 2025 at 9:18 PM David Sterba <dsterba@suse.cz> wrote:
> > >> On Mon, Apr 14, 2025 at 01:40:11PM +0300, Andy Shevchenko wrote:
> > >>> On Mon, Apr 14, 2025 at 4:20 AM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> > >>>> 在 2025/4/13 04:05, Andy Shevchenko 写道:
> > >>>>> Fri, Apr 11, 2025 at 02:44:01PM +0930, Qu Wenruo kirjoitti:
> 
> [...]
> 
> > >>>>>> +    block_start = round_down(clamp_start, block_size);
> > >>>>>> +    block_end = round_up(clamp_end + 1, block_size) - 1;
> > >>>>>
> > >>>>> LKP rightfully complains, I believe you want to use ALIGN*() macros instead.
> > >>>>
> > >>>> Personally speaking I really want to explicitly show whether it's
> > >>>> rounding up or down.
> > >>>>
> > >>>> And unfortunately the ALIGN() itself doesn't show that (meanwhile the
> > >>>> ALIGN_DOWN() is pretty fine).
> > >>>>
> > >>>> Can I just do a forced conversion on the @blocksize to fix the warning?
> > >>>
> > >>> ALIGN*() are for pointers, the round_*() are for integers. So, please
> > >>> use ALIGN*().
> > >>
> > >> clamp_start and blocksize are integers and there's a lot of use of ALIGN
> > >> with integers too. There's no documentation saying it should be used for
> > >> pointers, I can see PTR_ALIGN that does the explicit cast to unsigned
> > >> long and then passes it to ALIGN (as integer).
> > >
> > > Yes, because the unsigned long is natural holder for the addresses and
> > > due to some APIs use it instead of pointers (for whatever reasons) the
> > > PTR_ALIGN() does that. But you see the difference? round_*() expect
> > > _the same_ types of the arguments, while ALIGN*() do not. That is what
> > > makes it so.
> > >
> > >> Historically in the btrfs code the use of ALIGN and round_* is basically
> > >> 50/50 so we don't have a consistent style, although we'd like to. As the
> > >> round_up and round_down are clear I'd rather keep using them in new
> > >> code.
> > >
> > > And how do you suggest avoiding the warning, please?
> >
> > By fixing the typo, @block_size -> @blocksize.
> 
> Ah, if it's that simple, then of course round_*() is fine to keep.
> My only worry is about explicit casts being used to "fix" such a warning.

Both ALIGN and round_* seem to be fine with different types; there are
the tricks with masking the lower bits, and the alignment is explicitly
cast to the target type. Most often we have u64 as the target type and
u32 as the alignment.


end of thread, other threads:[~2025-04-16  8:28 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-11  5:13 [PATCH 0/2] btrfs: fix corner cases for subpage generic/363 failures Qu Wenruo
2025-04-11  5:14 ` [PATCH 1/2] btrfs: make btrfs_truncate_block() to zero involved blocks in a folio Qu Wenruo
2025-04-11  7:02   ` Qu Wenruo
2025-04-11  5:14 ` [PATCH 2/2] btrfs: make btrfs_truncate_block() zero folio range for certain subpage corner cases Qu Wenruo
2025-04-12  5:12   ` kernel test robot
2025-04-12  5:54   ` kernel test robot
2025-04-12 18:35   ` Andy Shevchenko
2025-04-14  1:20     ` Qu Wenruo
2025-04-14  3:37       ` Qu Wenruo
2025-04-14 10:40       ` Andy Shevchenko
2025-04-15 18:18         ` David Sterba
2025-04-15 18:21           ` Andy Shevchenko
2025-04-15 23:57             ` Qu Wenruo
2025-04-16  5:57               ` Andy Shevchenko
2025-04-16  8:28                 ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox