public inbox for linux-btrfs@vger.kernel.org
* [PATCH 0/4] btrfs: fix all bugs introduced in the compressed_folios[] removal
@ 2026-02-18 10:06 Qu Wenruo
  2026-02-18 10:06 ` [PATCH 1/4] btrfs: fix a bug that makes encoded write bio larger than expected Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Qu Wenruo @ 2026-02-18 10:06 UTC (permalink / raw)
  To: linux-btrfs

I'm a total idiot: my usual 64K page sized arm64 VM was reconfigured to
use 4K page size for bs > ps runs, and I forgot to revert the config
afterwards.

This made all recent arm64 runs use 4K page size, making them almost
no different from x86_64 runs.

This means the recent compressed_folios[] cleanup was never properly
tested for bs < ps cases, so the regressions were not detected during
development.

In fact commit e1bc83f8b157 ("btrfs: get rid of compressed_folios[] usage
for encoded writes") introduced two bugs in one go, one of which can even
lead to data corruption for bs < ps cases.

All bugs are caught by added ASSERT()s, but some ASSERT()s are just
incorrect in the first place, like the ones in patches 2~4.

Meanwhile the first one can lead to data corruption if
CONFIG_BTRFS_ASSERT is not selected, thus it will need higher priority.

Again, very sorry for my super stupid arm64 kernel config error.

I will no longer run 4K page sized kernels on that VM, and will deploy
a new VM for 4K page sized tests, with a proper kernel string suffix to
indicate the page size in the future.

Qu Wenruo (4):
  btrfs: fix a bug that makes encoded write bio larger than expected
  btrfs: do not touch page cache for encoded writes
  btrfs: fix an incorrect ASSERT() condition inside
    zstd_decompress_bio()
  btrfs: fix an incorrect ASSERT() condition inside lzo_decompress_bio()

 fs/btrfs/compression.c | 11 ++++++++---
 fs/btrfs/inode.c       |  4 +---
 fs/btrfs/lzo.c         |  4 ++--
 fs/btrfs/zstd.c        |  2 +-
 4 files changed, 12 insertions(+), 9 deletions(-)

-- 
2.52.0



* [PATCH 1/4] btrfs: fix a bug that makes encoded write bio larger than expected
  2026-02-18 10:06 [PATCH 0/4] btrfs: fix all bugs introduced in the compressed_folios[] removal Qu Wenruo
@ 2026-02-18 10:06 ` Qu Wenruo
  2026-02-18 10:06 ` [PATCH 2/4] btrfs: do not touch page cache for encoded writes Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2026-02-18 10:06 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
When running btrfs/284 with 64K page size and 4K fs block size, the
following ASSERT() can be triggered:

 assertion failed: cb->bbio.bio.bi_iter.bi_size == disk_num_bytes :: 0, in inode.c:9991
 ------------[ cut here ]------------
 kernel BUG at inode.c:9991!
 Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
 CPU: 5 UID: 0 PID: 6787 Comm: btrfs Tainted: G           OE       6.19.0-rc8-custom+ #1 PREEMPT(voluntary)
 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
 pc : btrfs_do_encoded_write+0x9b0/0x9c0 [btrfs]
 lr : btrfs_do_encoded_write+0x9b0/0x9c0 [btrfs]
 Call trace:
  btrfs_do_encoded_write+0x9b0/0x9c0 [btrfs] (P)
  btrfs_do_write_iter+0x1d8/0x208 [btrfs]
  btrfs_ioctl_encoded_write+0x3c8/0x6d0 [btrfs]
  btrfs_ioctl+0xeb0/0x2b60 [btrfs]
  __arm64_sys_ioctl+0xac/0x110
  invoke_syscall.constprop.0+0x64/0xe8
  el0_svc_common.constprop.0+0x40/0xe8
  do_el0_svc+0x24/0x38
  el0_svc+0x3c/0x1b8
  el0t_64_sync_handler+0xa0/0xe8
  el0t_64_sync+0x1a4/0x1a8
 Code: 91180021 90001080 9111a000 94039d54 (d4210000)
 ---[ end trace 0000000000000000 ]---

[CAUSE]
After commit e1bc83f8b157 ("btrfs: get rid of compressed_folios[] usage
for encoded writes"), the encoded write is changed to copy the content
from the iov into a folio, and queue the folio into the compressed bio.

However we always queue the full folio into the compressed bio, which
can make the compressed bio larger than the on-disk extent, if the folio
size is larger than the fs block size.

Although we have an ASSERT() to catch such a problem, for kernels without
CONFIG_BTRFS_ASSERT, such a larger than expected bio will just be
submitted, possibly overwriting the next data extent and causing data
corruption.

[FIX]
Instead of blindly queuing the full folio into the compressed bio, only
queue the needed byte range, which is the old behavior before that
offending commit.
This also means we no longer need to zero the trailing range, as such
a range will not be written to disk anyway.

And since we're here, add a final ASSERT() into
btrfs_submit_compressed_write() as the last safety net for debug kernels.
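
To make the size mismatch concrete, here is a minimal userspace sketch of
the arithmetic, not the kernel code itself. The names queued_bio_size()
and min_size() are hypothetical, and the loop is only a simplified model
of the copy loop in btrfs_do_encoded_write(): it assumes the payload is
copied one folio at a time and that each folio is then queued into the
bio.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the smaller of two sizes. */
static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Simplified model (an assumption, not the real kernel loop): the encoded
 * payload of @disk_num_bytes is copied into folios of @folio_size bytes.
 *
 * If @queue_full_folio is non-zero, every folio is queued with its full
 * size (the behavior described in [CAUSE]). Otherwise only the copied
 * bytes are queued (the behavior described in [FIX]).
 *
 * Returns the resulting bio size in bytes.
 */
static size_t queued_bio_size(size_t disk_num_bytes, size_t folio_size,
			      int queue_full_folio)
{
	size_t bio_size = 0;
	size_t remaining = disk_num_bytes;

	while (remaining > 0) {
		size_t bytes = min_size(remaining, folio_size);

		bio_size += queue_full_folio ? folio_size : bytes;
		remaining -= bytes;
	}
	return bio_size;
}
```

Under this model, a 4K extent with 64K folios yields a 64K bio when the
full folio is queued, and a 4K bio when only the copied bytes are queued.
With 4K folios both variants agree, which is consistent with the bug
staying hidden on 4K page size runs.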

Fixes: e1bc83f8b157 ("btrfs: get rid of compressed_folios[] usage for encoded writes")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/compression.c | 1 +
 fs/btrfs/inode.c       | 4 +---
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 1938d33ab57a..348551ab2c04 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -324,6 +324,7 @@ void btrfs_submit_compressed_write(struct btrfs_ordered_extent *ordered,
 
 	cb->start = ordered->file_offset;
 	cb->len = ordered->num_bytes;
+	ASSERT(cb->bbio.bio.bi_iter.bi_size == ordered->disk_num_bytes);
 	cb->compressed_len = ordered->disk_num_bytes;
 	cb->bbio.bio.bi_iter.bi_sector = ordered->disk_bytenr >> SECTOR_SHIFT;
 	cb->bbio.ordered = ordered;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index baf400847ce8..17911d33da0f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9981,9 +9981,7 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
 			ret = -EFAULT;
 			goto out_cb;
 		}
-		if (bytes < min_folio_size)
-			folio_zero_range(folio, bytes, min_folio_size - bytes);
-		ret = bio_add_folio(&cb->bbio.bio, folio, folio_size(folio), 0);
+		ret = bio_add_folio(&cb->bbio.bio, folio, bytes, 0);
 		if (unlikely(!ret)) {
 			folio_put(folio);
 			ret = -EINVAL;
-- 
2.52.0



* [PATCH 2/4] btrfs: do not touch page cache for encoded writes
  2026-02-18 10:06 [PATCH 0/4] btrfs: fix all bugs introduced in the compressed_folios[] removal Qu Wenruo
  2026-02-18 10:06 ` [PATCH 1/4] btrfs: fix a bug that makes encoded write bio larger than expected Qu Wenruo
@ 2026-02-18 10:06 ` Qu Wenruo
  2026-02-18 10:06 ` [PATCH 3/4] btrfs: fix an incorrect ASSERT() condition inside zstd_decompress_bio() Qu Wenruo
  2026-02-18 10:06 ` [PATCH 4/4] btrfs: fix an incorrect ASSERT() condition inside lzo_decompress_bio() Qu Wenruo
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2026-02-18 10:06 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
When running btrfs/284, the following ASSERT() will be triggered with
64K page size and 4K fs block size:

 assertion failed: folio_test_writeback(folio) :: 0, in subpage.c:476
 ------------[ cut here ]------------
 kernel BUG at subpage.c:476!
 Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
 CPU: 4 UID: 0 PID: 2313 Comm: kworker/u37:2 Tainted: G           OE       6.19.0-rc8-custom+ #185 PREEMPT(voluntary)
 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
 Workqueue: btrfs-endio simple_end_io_work [btrfs]
 pc : btrfs_subpage_clear_writeback+0x148/0x160 [btrfs]
 lr : btrfs_subpage_clear_writeback+0x148/0x160 [btrfs]
 Call trace:
  btrfs_subpage_clear_writeback+0x148/0x160 [btrfs] (P)
  btrfs_folio_clamp_clear_writeback+0xb4/0xd0 [btrfs]
  end_compressed_writeback+0xe0/0x1e0 [btrfs]
  end_bbio_compressed_write+0x1e8/0x218 [btrfs]
  btrfs_bio_end_io+0x108/0x258 [btrfs]
  simple_end_io_work+0x68/0xa8 [btrfs]
  process_one_work+0x168/0x3f0
  worker_thread+0x25c/0x398
  kthread+0x154/0x250
  ret_from_fork+0x10/0x20
 ---[ end trace 0000000000000000 ]---

[CAUSE]
The offending bio is from an encoded write, where the compressed data is
directly written as a data extent, without touching the page cache.

However the encoded write still utilizes the regular buffered write path
for compressed data, by setting the compressed_bio::writeback flag.

When that flag is set, end_bbio_compressed_write() will clear the
writeback flags of the folios in page cache.

However for bs < ps cases, the subpage helper has one extra check to make
sure the folio has the writeback flag set in the first place.

But since it's an encoded write, we never go through page cache, thus
the folio has no writeback flag, which triggers the ASSERT().

[FIX]
Do not set compressed_bio::writeback flag for encoded writes, and change
the ASSERT() in btrfs_submit_compressed_write() to make sure that flag
is not set.
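
The interaction described above can be sketched as a tiny userspace
model. This is only an illustration under stated assumptions: the
function endio_would_assert() is hypothetical, and it reduces
compressed_bio::writeback and the folio writeback state to two booleans.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the end-io decision for bs < ps:
 *
 * @cb_writeback:    models compressed_bio::writeback
 * @folio_writeback: models whether the page cache folio has its
 *                   writeback flag set
 *
 * Returns true if the subpage check would fire, i.e. end-io tries to
 * clear a writeback flag that was never set.
 */
static bool endio_would_assert(bool cb_writeback, bool folio_writeback)
{
	/* Flag not set: end-io does not touch the page cache at all. */
	if (!cb_writeback)
		return false;

	/* Flag set: the subpage helper expects writeback to be set. */
	return !folio_writeback;
}
```

In this model, an encoded write (folio never marked writeback) trips the
check only when the compressed bio carries the writeback flag, which is
why clearing the flag for encoded writes is sufficient.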

Fixes: e1bc83f8b157 ("btrfs: get rid of compressed_folios[] usage for encoded writes")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/compression.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 348551ab2c04..64600b6458cb 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -320,7 +320,12 @@ void btrfs_submit_compressed_write(struct btrfs_ordered_extent *ordered,
 
 	ASSERT(IS_ALIGNED(ordered->file_offset, fs_info->sectorsize));
 	ASSERT(IS_ALIGNED(ordered->num_bytes, fs_info->sectorsize));
-	ASSERT(cb->writeback);
+	/*
+	 * This flag determines if we should clear writeback flags
+	 * from page cache. But this function is only utilized by
+	 * encoded writes, which never go through page cache.
+	 */
+	ASSERT(!cb->writeback);
 
 	cb->start = ordered->file_offset;
 	cb->len = ordered->num_bytes;
@@ -346,8 +351,7 @@ struct compressed_bio *btrfs_alloc_compressed_write(struct btrfs_inode *inode,
 	cb = alloc_compressed_bio(inode, start, REQ_OP_WRITE, end_bbio_compressed_write);
 	cb->start = start;
 	cb->len = len;
-	cb->writeback = true;
-
+	cb->writeback = false;
 	return cb;
 }
 
-- 
2.52.0



* [PATCH 3/4] btrfs: fix an incorrect ASSERT() condition inside zstd_decompress_bio()
  2026-02-18 10:06 [PATCH 0/4] btrfs: fix all bugs introduced in the compressed_folios[] removal Qu Wenruo
  2026-02-18 10:06 ` [PATCH 1/4] btrfs: fix a bug that makes encoded write bio larger than expected Qu Wenruo
  2026-02-18 10:06 ` [PATCH 2/4] btrfs: do not touch page cache for encoded writes Qu Wenruo
@ 2026-02-18 10:06 ` Qu Wenruo
  2026-02-18 10:06 ` [PATCH 4/4] btrfs: fix an incorrect ASSERT() condition inside lzo_decompress_bio() Qu Wenruo
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2026-02-18 10:06 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
When running btrfs/284 with 64K page size and 4K fs block size, it
crashes with the following ASSERT() triggered:

 assertion failed: folio_size(fi.folio) == blocksize :: 0, in fs/btrfs/zstd.c:603
 ------------[ cut here ]------------
 kernel BUG at fs/btrfs/zstd.c:603!
 Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
 CPU: 2 UID: 0 PID: 1183 Comm: kworker/u35:4 Not tainted 6.19.0-rc8-custom+ #185 PREEMPT(voluntary)
 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
 Workqueue: btrfs-endio simple_end_io_work [btrfs]
 pc : zstd_decompress_bio+0x4f0/0x508 [btrfs]
 lr : zstd_decompress_bio+0x4f0/0x508 [btrfs]
 Call trace:
  zstd_decompress_bio+0x4f0/0x508 [btrfs] (P)
  end_bbio_compressed_read+0x260/0x2c0 [btrfs]
  btrfs_bio_end_io+0xc4/0x258 [btrfs]
  btrfs_check_read_bio+0x424/0x7e0 [btrfs]
  simple_end_io_work+0x40/0xa8 [btrfs]
  process_one_work+0x168/0x3f0
  worker_thread+0x25c/0x398
  kthread+0x154/0x250
  ret_from_fork+0x10/0x20
 ---[ end trace 0000000000000000 ]---

[CAUSE]
Commit 1914b94231e9 ("btrfs: zstd: use folio_iter to handle
zstd_decompress_bio()") added the ASSERT() to make sure the folio size
matches the fs block size.

But the check is completely wrong. The original intention was to make
sure that for bs > ps cases, we always get a large folio that covers a
full fs block.

However for bs < ps cases, a folio can never be smaller than page size,
so it is always larger than the fs block and the ASSERT() gets triggered
immediately.

[FIX]
Check the folio size against @min_folio_size instead, which will never
be smaller than PAGE_SIZE, and still covers bs > ps cases.
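
As a rough illustration of why the new condition holds in both
directions, the sketch below models the two checks in userspace. It
assumes the minimal folio size is simply the larger of the page size and
the fs block size. That matches the description above but is not taken
from the kernel source, and all function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: min folio size == max(page size, fs block size). */
static uint32_t model_min_folio_size(uint32_t page_size, uint32_t block_size)
{
	return block_size > page_size ? block_size : page_size;
}

/* The old ASSERT() condition: folio size must equal the fs block size. */
static bool old_check(uint32_t folio_size, uint32_t block_size)
{
	return folio_size == block_size;
}

/* The new ASSERT() condition: folio size must equal the min folio size. */
static bool new_check(uint32_t folio_size, uint32_t page_size,
		      uint32_t block_size)
{
	return folio_size == model_min_folio_size(page_size, block_size);
}
```

With 64K pages and 4K blocks, a 64K folio fails old_check() but passes
new_check(). With 4K pages and 16K blocks, a 16K folio passes both, so
the bs > ps intention of the original ASSERT() is preserved.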

Fixes: 1914b94231e9 ("btrfs: zstd: use folio_iter to handle zstd_decompress_bio()")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/zstd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 32fd7f5454d3..c002d18666b7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -600,7 +600,7 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 	bio_first_folio(&fi, &cb->bbio.bio, 0);
 	if (unlikely(!fi.folio))
 		return -EINVAL;
-	ASSERT(folio_size(fi.folio) == blocksize);
+	ASSERT(folio_size(fi.folio) == min_folio_size);
 
 	stream = zstd_init_dstream(
 			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
-- 
2.52.0



* [PATCH 4/4] btrfs: fix an incorrect ASSERT() condition inside lzo_decompress_bio()
  2026-02-18 10:06 [PATCH 0/4] btrfs: fix all bugs introduced in the compressed_folios[] removal Qu Wenruo
                   ` (2 preceding siblings ...)
  2026-02-18 10:06 ` [PATCH 3/4] btrfs: fix an incorrect ASSERT() condition inside zstd_decompress_bio() Qu Wenruo
@ 2026-02-18 10:06 ` Qu Wenruo
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2026-02-18 10:06 UTC (permalink / raw)
  To: linux-btrfs

[BUG]
When running btrfs/284 with 64K page size and 4K fs block size, it
crashes with the following ASSERT() triggered:

 BTRFS info (device dm-3): use lzo compression, level 1
 assertion failed: folio_size(fi.folio) == sectorsize :: 0, in lzo.c:450
 ------------[ cut here ]------------
 kernel BUG at lzo.c:450!
 Internal error: Oops - BUG: 00000000f2000800 [#1]  SMP
 CPU: 4 UID: 0 PID: 329 Comm: kworker/u37:2 Tainted: G           OE       6.19.0-rc8-custom+ #185 PREEMPT(voluntary)
 Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
 Workqueue: btrfs-endio simple_end_io_work [btrfs]
 pc : lzo_decompress_bio+0x61c/0x630 [btrfs]
 lr : lzo_decompress_bio+0x61c/0x630 [btrfs]
 Call trace:
  lzo_decompress_bio+0x61c/0x630 [btrfs] (P)
  end_bbio_compressed_read+0x2a8/0x2c0 [btrfs]
  btrfs_bio_end_io+0xc4/0x258 [btrfs]
  btrfs_check_read_bio+0x424/0x7e0 [btrfs]
  simple_end_io_work+0x40/0xa8 [btrfs]
  process_one_work+0x168/0x3f0
  worker_thread+0x25c/0x398
  kthread+0x154/0x250
  ret_from_fork+0x10/0x20
 Code: 912a2021 b0000e00 91246000 940244e9 (d4210000)
 ---[ end trace 0000000000000000 ]---

[CAUSE]
Commit 37cc07cab7dc ("btrfs: lzo: use folio_iter to handle
lzo_decompress_bio()") added the ASSERT() to make sure the folio size
matches the fs block size.

But the check is completely wrong. The original intention was to make
sure that for bs > ps cases, we always get a large folio that covers a
full fs block.

However for bs < ps cases, a folio can never be smaller than page size,
so it is always larger than the fs block and the ASSERT() gets triggered
immediately.

[FIX]
Check the folio size against @min_folio_size instead, which will never
be smaller than PAGE_SIZE, and still covers bs > ps cases.

Fixes: 37cc07cab7dc ("btrfs: lzo: use folio_iter to handle lzo_decompress_bio()")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/lzo.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
index 8e20497afffe..971c2ea98e18 100644
--- a/fs/btrfs/lzo.c
+++ b/fs/btrfs/lzo.c
@@ -429,7 +429,7 @@ static void copy_compressed_segment(struct compressed_bio *cb,
 int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 {
 	struct workspace *workspace = list_entry(ws, struct workspace, list);
-	const struct btrfs_fs_info *fs_info = cb->bbio.inode->root->fs_info;
+	struct btrfs_fs_info *fs_info = cb->bbio.inode->root->fs_info;
 	const u32 sectorsize = fs_info->sectorsize;
 	struct folio_iter fi;
 	char *kaddr;
@@ -447,7 +447,7 @@ int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 	/* There must be a compressed folio and matches the sectorsize. */
 	if (unlikely(!fi.folio))
 		return -EINVAL;
-	ASSERT(folio_size(fi.folio) == sectorsize);
+	ASSERT(folio_size(fi.folio) == btrfs_min_folio_size(fs_info));
 	kaddr = kmap_local_folio(fi.folio, 0);
 	len_in = read_compress_length(kaddr);
 	kunmap_local(kaddr);
-- 
2.52.0

