* Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
@ 2026-03-09 16:37 Johannes Thumshirn
2026-03-09 20:56 ` Joanne Koong
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Thumshirn @ 2026-03-09 16:37 UTC (permalink / raw)
To: Joanne Koong; +Cc: linux-fsdevel@vger.kernel.org
Hi Joanne,
After commit aa35dd5cbc06 ("iomap: fix invalid folio access after
folio_end_read()") my zoned btrfs test setup hangs. I've bisected it to
this commit and reverting fixes my problem. The last thing I see in
dmesg is:
[ 9.387175] ------------[ cut here ]------------
[ 9.387320] WARNING: fs/iomap/buffered-io.c:487 at
iomap_read_end+0x11c/0x140, CPU#5: (udev-worker)/463
[ 9.387431] Modules linked in:
[ 9.387502] CPU: 5 UID: 0 PID: 463 Comm: (udev-worker) Not tainted
6.19.0-rc1+ #385 PREEMPT(full)
[ 9.387626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.17.0-9.fc43 06/10/2025
[ 9.387810] RIP: 0010:iomap_read_end+0x11c/0x140
[ 9.387886] Code: 00 48 89 ef 48 3b 04 24 0f 94 04 24 44 0f b6 34 24
e8 b8 88 69 00 48 89 df 48 83 c4 08 41 0f b6 f6 5b 5d 41 5e e9 54 e7 e8
ff <0f> 0b e9 48 ff ff ff ba 00 10 00 00 eb 9d 0f 0b e9 53 ff ff ff 0f
[ 9.388096] RSP: 0018:ffffc90000e279c0 EFLAGS: 00010206
[ 9.388178] RAX: ffff888111c6b800 RBX: ffffea0004363900 RCX:
0000000000000000
[ 9.388281] RDX: 0000000000000000 RSI: 0000000000000400 RDI:
ffffea0004363900
[ 9.388386] RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000001
[ 9.388491] R10: ffffc90000e279a0 R11: ffff888111c6c168 R12:
ffffffff81e5cdb0
[ 9.388585] R13: ffffc90000e27c58 R14: ffffea0004363900 R15:
ffffc90000e27c58
[ 9.388690] FS: 00007f6e03fdfc00(0000) GS:ffff8882b4c13000(0000)
knlGS:0000000000000000
[ 9.388793] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.388873] CR2: 00007f6e03280000 CR3: 00000001106d7006 CR4:
0000000000770eb0
[ 9.388967] PKRU: 55555554
[ 9.389006] Call Trace:
[ 9.389044] <TASK>
[ 9.389087] iomap_readahead+0x23c/0x2e0
[ 9.389167] blkdev_readahead+0x3d/0x50
[ 9.389222] read_pages+0x56/0x200
[ 9.389277] ? __folio_batch_add_and_move+0x1cf/0x2d0
[ 9.389354] page_cache_ra_unbounded+0x1db/0x2c0
[ 9.389423] force_page_cache_ra+0x96/0xb0
[ 9.389470] filemap_get_pages+0x12f/0x490
[ 9.389532] filemap_read+0xed/0x400
[ 9.389590] ? lock_acquire+0xd5/0x2b0
[ 9.389633] ? blkdev_read_iter+0x6b/0x180
[ 9.389678] ? lock_acquire+0xe5/0x2b0
[ 9.389720] ? lock_is_held_type+0xcd/0x130
[ 9.389761] ? find_held_lock+0x2b/0x80
[ 9.389813] ? lock_acquired+0x1e9/0x3c0
[ 9.389864] blkdev_read_iter+0x79/0x180
[ 9.389911] ? local_clock_noinstr+0x17/0x110
[ 9.389975] vfs_read+0x240/0x340
[ 9.390033] ksys_read+0x61/0xd0
[ 9.390083] do_syscall_64+0x74/0x3a0
[ 9.390143] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 9.390204] RIP: 0033:0x7f6e048c5c5e
[ 9.390255] Code: 4d 89 d8 e8 34 bd 00 00 4c 8b 5d f8 41 8b 93 08 03
00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f
05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa
[ 9.390447] RSP: 002b:00007ffdf50cded0 EFLAGS: 00000202 ORIG_RAX:
0000000000000000
[ 9.390529] RAX: ffffffffffffffda RBX: 000000013aa55200 RCX:
00007f6e048c5c5e
[ 9.390619] RDX: 0000000000000200 RSI: 00007f6e0327f000 RDI:
0000000000000014
[ 9.390704] RBP: 00007ffdf50cdee0 R08: 0000000000000000 R09:
0000000000000000
[ 9.390789] R10: 0000000000000000 R11: 0000000000000202 R12:
0000000000000000
[ 9.390866] R13: 000055d5f99fa270 R14: 000055d5f96e5cb0 R15:
000055d5f96e5cc8
[ 9.390967] </TASK>
[ 9.390997] irq event stamp: 54269
[ 9.391044] hardirqs last enabled at (54279): [<ffffffff8138ac02>]
__up_console_sem+0x52/0x60
[ 9.391141] hardirqs last disabled at (54288): [<ffffffff8138abe7>]
__up_console_sem+0x37/0x60
[ 9.391240] softirqs last enabled at (53796): [<ffffffff81304768>]
irq_exit_rcu+0x78/0x110
[ 9.391327] softirqs last disabled at (53787): [<ffffffff81304768>]
irq_exit_rcu+0x78/0x110
[ 9.391420] ---[ end trace 0000000000000000 ]---
I haven't debugged this further yet, maybe you have an idea what
could've caused it. On my side it's trivial to reproduce, so if you
can't reproduce it just yell.
Byte,
Johannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-09 16:37 Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()") Johannes Thumshirn
@ 2026-03-09 20:56 ` Joanne Koong
2026-03-10 11:44 ` Johannes Thumshirn
0 siblings, 1 reply; 9+ messages in thread
From: Joanne Koong @ 2026-03-09 20:56 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: linux-fsdevel@vger.kernel.org
On Mon, Mar 9, 2026 at 9:37 AM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
> Hi Joanne,
>
> After commit aa35dd5cbc06 ("iomap: fix invalid folio access after
> folio_end_read()") my zoned btrfs test setup hangs. I've bisected it to
> this commit and reverting fixes my problem. The last thing I see in
> dmesg is:
>
> [ 9.387175] ------------[ cut here ]------------
> [ 9.387320] WARNING: fs/iomap/buffered-io.c:487 at
> iomap_read_end+0x11c/0x140, CPU#5: (udev-worker)/463
> [ 9.387431] Modules linked in:
> [ 9.387502] CPU: 5 UID: 0 PID: 463 Comm: (udev-worker) Not tainted
> 6.19.0-rc1+ #385 PREEMPT(full)
> [ 9.387626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.17.0-9.fc43 06/10/2025
> [ 9.387810] RIP: 0010:iomap_read_end+0x11c/0x140
> [ 9.387886] Code: 00 48 89 ef 48 3b 04 24 0f 94 04 24 44 0f b6 34 24
> e8 b8 88 69 00 48 89 df 48 83 c4 08 41 0f b6 f6 5b 5d 41 5e e9 54 e7 e8
> ff <0f> 0b e9 48 ff ff ff ba 00 10 00 00 eb 9d 0f 0b e9 53 ff ff ff 0f
> [ 9.388096] RSP: 0018:ffffc90000e279c0 EFLAGS: 00010206
> [ 9.388178] RAX: ffff888111c6b800 RBX: ffffea0004363900 RCX:
> 0000000000000000
> [ 9.388281] RDX: 0000000000000000 RSI: 0000000000000400 RDI:
> ffffea0004363900
> [ 9.388386] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000001
> [ 9.388491] R10: ffffc90000e279a0 R11: ffff888111c6c168 R12:
> ffffffff81e5cdb0
> [ 9.388585] R13: ffffc90000e27c58 R14: ffffea0004363900 R15:
> ffffc90000e27c58
> [ 9.388690] FS: 00007f6e03fdfc00(0000) GS:ffff8882b4c13000(0000)
> knlGS:0000000000000000
> [ 9.388793] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 9.388873] CR2: 00007f6e03280000 CR3: 00000001106d7006 CR4:
> 0000000000770eb0
> [ 9.388967] PKRU: 55555554
> [ 9.389006] Call Trace:
> [ 9.389044] <TASK>
> [ 9.389087] iomap_readahead+0x23c/0x2e0
> [ 9.389167] blkdev_readahead+0x3d/0x50
> [ 9.389222] read_pages+0x56/0x200
> [ 9.389277] ? __folio_batch_add_and_move+0x1cf/0x2d0
> [ 9.389354] page_cache_ra_unbounded+0x1db/0x2c0
> [ 9.389423] force_page_cache_ra+0x96/0xb0
> [ 9.389470] filemap_get_pages+0x12f/0x490
> [ 9.389532] filemap_read+0xed/0x400
> [ 9.389590] ? lock_acquire+0xd5/0x2b0
> [ 9.389633] ? blkdev_read_iter+0x6b/0x180
> [ 9.389678] ? lock_acquire+0xe5/0x2b0
> [ 9.389720] ? lock_is_held_type+0xcd/0x130
> [ 9.389761] ? find_held_lock+0x2b/0x80
> [ 9.389813] ? lock_acquired+0x1e9/0x3c0
> [ 9.389864] blkdev_read_iter+0x79/0x180
> [ 9.389911] ? local_clock_noinstr+0x17/0x110
> [ 9.389975] vfs_read+0x240/0x340
> [ 9.390033] ksys_read+0x61/0xd0
> [ 9.390083] do_syscall_64+0x74/0x3a0
> [ 9.390143] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 9.390204] RIP: 0033:0x7f6e048c5c5e
> [ 9.390255] Code: 4d 89 d8 e8 34 bd 00 00 4c 8b 5d f8 41 8b 93 08 03
> 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f
> 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa
> [ 9.390447] RSP: 002b:00007ffdf50cded0 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000000
> [ 9.390529] RAX: ffffffffffffffda RBX: 000000013aa55200 RCX:
> 00007f6e048c5c5e
> [ 9.390619] RDX: 0000000000000200 RSI: 00007f6e0327f000 RDI:
> 0000000000000014
> [ 9.390704] RBP: 00007ffdf50cdee0 R08: 0000000000000000 R09:
> 0000000000000000
> [ 9.390789] R10: 0000000000000000 R11: 0000000000000202 R12:
> 0000000000000000
> [ 9.390866] R13: 000055d5f99fa270 R14: 000055d5f96e5cb0 R15:
> 000055d5f96e5cc8
> [ 9.390967] </TASK>
> [ 9.390997] irq event stamp: 54269
> [ 9.391044] hardirqs last enabled at (54279): [<ffffffff8138ac02>]
> __up_console_sem+0x52/0x60
> [ 9.391141] hardirqs last disabled at (54288): [<ffffffff8138abe7>]
> __up_console_sem+0x37/0x60
> [ 9.391240] softirqs last enabled at (53796): [<ffffffff81304768>]
> irq_exit_rcu+0x78/0x110
> [ 9.391327] softirqs last disabled at (53787): [<ffffffff81304768>]
> irq_exit_rcu+0x78/0x110
> [ 9.391420] ---[ end trace 0000000000000000 ]---
>
> I haven't debugged this further yet, maybe you have an idea what
> could've caused it. On my side it's trivial to reproduce, so if you
> can't reproduce it just yell.
Hi Johannes,
Thanks for your report and for bisecting this.
A few questions:
1. From the stack trace it looks like this is happening during a
block device read triggered by udev-worker's device probing. Does this
trigger for you consistently when the zoned device is probed or only
when running the btrfs xfstests generic/648?
2. I tried to repro your setup by running:
sudo modprobe null_blk nr_devices=2 zoned=1 zone_size=256 zone_nr_conv=8 memory_backed=1
sudo mkdir -p /mnt/test && sudo mkdir -p /mnt/scratch
sudo mkfs.btrfs -f /dev/nullb0
sudo mount /dev/nullb0 /mnt/test
sudo ./check generic/648
with local.config set to:
TEST_DEV=/dev/nullb0
TEST_DIR=/mnt/test
SCRATCH_DEV=/dev/nullb1
SCRATCH_MNT=/mnt/scratch
export FSTYP=btrfs
but I’m not seeing the hang or the WARNING in dmesg show up. I am
running it with PREEMPT(full) enabled. Does this match your setup or
are you doing something differently? Are you using null_blk or an
actual zoned device? If you are using null_blk, what module parameters
are you using?
3. If you're able to repro this consistently, would you be able to add
these lines right above the WARN_ON on line 487 and share what they
print out?
+++ b/fs/iomap/buffered-io.c
@@ -484,6 +484,17 @@ static void iomap_read_end(struct folio *folio, size_t bytes_submitted)
* to the IO helper, in which case we are responsible for
* unlocking the folio here.
*/
+ if (bytes_submitted) {
+ struct inode *inode = folio->mapping->host;
+ struct block_device *bdev = inode->i_sb->s_bdev;
+
+ pr_warn("bytes_submitted=%zu folio_size=%zu blkbits=%u isize=%lld "
+ "logical_bs=%u physical_bs=%u\n",
+ bytes_submitted, folio_size(folio), inode->i_blkbits,
+ i_size_read(inode),
+ bdev ? bdev_logical_block_size(bdev) : 0,
+ bdev ? bdev_physical_block_size(bdev) : 0);
+ }
WARN_ON_ONCE(bytes_submitted);
folio_unlock(folio);
}
Thanks,
Joanne
>
> Byte,
>
> Johannes
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-09 20:56 ` Joanne Koong
@ 2026-03-10 11:44 ` Johannes Thumshirn
2026-03-10 21:08 ` Joanne Koong
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Thumshirn @ 2026-03-10 11:44 UTC (permalink / raw)
To: Joanne Koong; +Cc: linux-fsdevel@vger.kernel.org
On 3/9/26 9:56 PM, Joanne Koong wrote:
> A few questions:
> 1. From the stack trace it looks like this is happening during a
> block device read triggered by udev-worker's device probing. Does this
> trigger for you consistently when the zoned device is probed or only
> when running the btrfs xfstests generic/648?
It's only on g/648. I've also tried zoned XFS but g/648 didn't like my
XFS and skipped the test case.
> 2. I tried to repro your setup by running:
> sudo modprobe null_blk nr_devices=2 zoned=1 zone_size=256
> zone_nr_conv=8 memory_backed=1
> sudo mkdir -p /mnt/test && sudo mkdir -p /mnt/scratch
> sudo mkfs.btrfs -f /dev/nullb0
> sudo mount /dev/nullb0 /mnt/test
> sudo ./check generic/648
>
> with local.config set to:
> TEST_DEV=/dev/nullb0
> TEST_DIR=/mnt/test
> SCRATCH_DEV=/dev/nullb1
> SCRATCH_MNT=/mnt/scratch
> export FSTYP=btrfs
>
> but I’m not seeing the hang or the WARNING in dmesg show up. I am
> running it with PREEMPT(full) enabled. Does this match your setup or
> are you doing something differently? Are you using null_blk or an
> actual zoned device? If you are using null_blk, what module parameters
> are you using?
I'm using a zloop device passed to a VM via virtio-blk. The zloop
configuration is just:
echo "add id=0,zone_size_mb=256,conv_zones=4" > /dev/zloop-control
echo "add id=1,zone_size_mb=256,conv_zones=4" > /dev/zloop-control
The virtio-blk config is:
-drive driver=host_device,file=/dev/zloop0,if=virtio,cache.direct=on \
-drive driver=host_device,file=/dev/zloop1,if=virtio,cache.direct=on
The rootfs is on virtio-fs (using virtme-ng).
>
> 3. If you're able to repro this consistently, would you be able to add
> these lines right above the WARN_ON on line 487 and sharing what it
> prints out?
>
> +++ b/fs/iomap/buffered-io.c
> @@ -484,6 +484,17 @@ static void iomap_read_end(struct folio *folio,
> size_t bytes_submitted)
> * to the IO helper, in which case we are responsible for
> * unlocking the folio here.
> */
> + if (bytes_submitted) {
> + struct inode *inode = folio->mapping->host;
> + struct block_device *bdev = inode->i_sb->s_bdev;
> +
> + pr_warn("bytes_submitted=%zu folio_size=%zu
> blkbits=%u isize=%lld "
> + "logical_bs=%u physical_bs=%u\n",
> + bytes_submitted, folio_size(folio),
> inode->i_blkbits,
> + i_size_read(inode),
> + bdev ? bdev_logical_block_size(bdev) : 0,
> + bdev ? bdev_physical_block_size(bdev) : 0);
> + }
> WARN_ON_ONCE(bytes_submitted);
> folio_unlock(folio);
> }
Here we go:
[ 17.872952] bytes_submitted=1024 folio_size=4096 blkbits=12 isize=5278880768 logical_bs=0 physical_bs=0
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 11:44 ` Johannes Thumshirn
@ 2026-03-10 21:08 ` Joanne Koong
2026-03-10 21:55 ` Joanne Koong
2026-03-10 21:59 ` Darrick J. Wong
0 siblings, 2 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-10 21:08 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: linux-fsdevel@vger.kernel.org
On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
>
> >
> > 3. If you're able to repro this consistently, would you be able to add
> > these lines right above the WARN_ON on line 487 and sharing what it
> > prints out?
> >
> > +++ b/fs/iomap/buffered-io.c
> > @@ -484,6 +484,17 @@ static void iomap_read_end(struct folio *folio,
> > size_t bytes_submitted)
> > * to the IO helper, in which case we are responsible for
> > * unlocking the folio here.
> > */
> > + if (bytes_submitted) {
> > + struct inode *inode = folio->mapping->host;
> > + struct block_device *bdev = inode->i_sb->s_bdev;
> > +
> > + pr_warn("bytes_submitted=%zu folio_size=%zu
> > blkbits=%u isize=%lld "
> > + "logical_bs=%u physical_bs=%u\n",
> > + bytes_submitted, folio_size(folio),
> > inode->i_blkbits,
> > + i_size_read(inode),
> > + bdev ? bdev_logical_block_size(bdev) : 0,
> > + bdev ? bdev_physical_block_size(bdev) : 0);
> > + }
> > WARN_ON_ONCE(bytes_submitted);
> > folio_unlock(folio);
> > }
>
> Here we go:
>
> [ 17.872952] bytes_submitted=1024 folio_size=4096 blkbits=12
> isize=5278880768 logical_bs=0 physical_bs=0
Thank you, this is very helpful.
Is it correct that the block device's inode->i_blkbits doesn't reflect
its actual logical block size (512) or is that itself a bug somewhere
in the block layer? On the iomap side, it uses inode->i_blkbits to
determine whether or not an ifs should be attached and what logic to
correspondingly call. For this case, shouldn't the inode's i_blkbits
be 9 since "isize= 5278880768" indicates it's 512-byte aligned, not
4096-byte aligned? I'm not familiar with the block layer, so your
thoughts on this would be appreciated.
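The numbers in the debug output above can be cross-checked with a little arithmetic. The following is a userspace sketch only, not kernel code; the helper names `tail_bytes_in_last_folio` and `blocks_per_folio` are made up for illustration, and the assumption that a folio holding a single block gets no ifs attached is a reading of the discussion here rather than something this sketch verifies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many bytes of the device's final folio lie
 * before end-of-device. Returns folio_size if isize is folio-aligned.
 */
static inline uint64_t tail_bytes_in_last_folio(uint64_t isize,
						uint64_t folio_size)
{
	uint64_t rem;

	assert(folio_size != 0);
	rem = isize % folio_size;
	return rem ? rem : folio_size;
}

/* Hypothetical helper: number of blkbits-sized blocks in one folio. */
static inline uint64_t blocks_per_folio(uint64_t folio_size,
					unsigned int blkbits)
{
	assert(blkbits < 64);
	return folio_size >> blkbits;
}
```

With isize=5278880768 and a 4096-byte folio, the remainder is 1024, which matches the reported bytes_submitted=1024, while isize is an exact multiple of 512. With blkbits=12 a 4096-byte folio holds one block; with blkbits=9 it would hold eight.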
If it is fine for the block device's inode->i_blkbits to be a
different value from the logical block size, then I think the iomap
side needs this fix:
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 867e8ac761c8..03a97361570f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -543,7 +543,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
* helper, then the helper owns the folio and will end
* the read on it.
*/
- if (*bytes_submitted == folio_len)
+ if (*bytes_submitted == folio_len || !ifs)
ctx->cur_folio = NULL;
}
Could you verify that this fixes the hang?
Thanks,
Joanne
>
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 21:08 ` Joanne Koong
@ 2026-03-10 21:55 ` Joanne Koong
2026-03-17 9:34 ` Johannes Thumshirn
2026-03-10 21:59 ` Darrick J. Wong
1 sibling, 1 reply; 9+ messages in thread
From: Joanne Koong @ 2026-03-10 21:55 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: linux-fsdevel@vger.kernel.org
On Tue, Mar 10, 2026 at 2:08 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn
> <Johannes.Thumshirn@wdc.com> wrote:
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 867e8ac761c8..03a97361570f 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -543,7 +543,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> * helper, then the helper owns the folio and will end
> * the read on it.
> */
> - if (*bytes_submitted == folio_len)
> + if (*bytes_submitted == folio_len || !ifs)
> ctx->cur_folio = NULL;
> }
it should be this instead:
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 867e8ac761c8..b803f518adaf 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -497,6 +497,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
loff_t length = iomap_length(iter);
struct folio *folio = ctx->cur_folio;
size_t folio_len = folio_size(folio);
+ struct iomap_folio_state *ifs;
size_t poff, plen;
loff_t pos_diff;
int ret;
@@ -508,7 +509,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
return iomap_iter_advance(iter, length);
}
- ifs_alloc(iter->inode, folio, iter->flags);
+ ifs = ifs_alloc(iter->inode, folio, iter->flags);
length = min_t(loff_t, length, folio_len - offset_in_folio(folio, pos));
while (length) {
@@ -543,7 +544,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
* helper, then the helper owns the folio and will end
* the read on it.
*/
- if (*bytes_submitted == folio_len)
+ if (*bytes_submitted == folio_len || !ifs)
ctx->cur_folio = NULL;
}
Thanks,
Joanne
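For readers following along, the effect of the changed condition can be modelled outside the kernel. This is a simplified userspace sketch based only on the diff and comment quoted in this message; the function names are hypothetical, and `has_ifs` stands in for whether `ifs_alloc()` returned a non-NULL state.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Before the fix: the IO helper is treated as the folio's owner only
 * when the whole folio has been submitted to it.
 */
static inline bool helper_owns_folio_before(size_t bytes_submitted,
					    size_t folio_len)
{
	return bytes_submitted == folio_len;
}

/*
 * After the fix: a folio without an ifs is also handed to the helper,
 * even if only part of it was submitted. The check runs right after a
 * submission, so bytes_submitted is non-zero at that point.
 */
static inline bool helper_owns_folio_after(size_t bytes_submitted,
					   size_t folio_len, bool has_ifs)
{
	return bytes_submitted == folio_len || !has_ifs;
}
```

For the reported case (bytes_submitted=1024, folio_len=4096, no ifs), the old condition leaves the folio with the caller, which then reaches the WARN_ON_ONCE in iomap_read_end(); the new condition hands it to the helper.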
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 21:08 ` Joanne Koong
2026-03-10 21:55 ` Joanne Koong
@ 2026-03-10 21:59 ` Darrick J. Wong
2026-03-17 19:47 ` Joanne Koong
1 sibling, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2026-03-10 21:59 UTC (permalink / raw)
To: Joanne Koong; +Cc: Johannes Thumshirn, linux-fsdevel@vger.kernel.org
On Tue, Mar 10, 2026 at 02:08:28PM -0700, Joanne Koong wrote:
> On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn
> <Johannes.Thumshirn@wdc.com> wrote:
> >
> >
> > >
> > > 3. If you're able to repro this consistently, would you be able to add
> > > these lines right above the WARN_ON on line 487 and sharing what it
> > > prints out?
> > >
> > > +++ b/fs/iomap/buffered-io.c
> > > @@ -484,6 +484,17 @@ static void iomap_read_end(struct folio *folio,
> > > size_t bytes_submitted)
> > > * to the IO helper, in which case we are responsible for
> > > * unlocking the folio here.
> > > */
> > > + if (bytes_submitted) {
> > > + struct inode *inode = folio->mapping->host;
> > > + struct block_device *bdev = inode->i_sb->s_bdev;
> > > +
> > > + pr_warn("bytes_submitted=%zu folio_size=%zu
> > > blkbits=%u isize=%lld "
> > > + "logical_bs=%u physical_bs=%u\n",
> > > + bytes_submitted, folio_size(folio),
> > > inode->i_blkbits,
> > > + i_size_read(inode),
> > > + bdev ? bdev_logical_block_size(bdev) : 0,
> > > + bdev ? bdev_physical_block_size(bdev) : 0);
> > > + }
> > > WARN_ON_ONCE(bytes_submitted);
> > > folio_unlock(folio);
> > > }
> >
> > Here we go:
> >
> > [ 17.872952] bytes_submitted=1024 folio_size=4096 blkbits=12
> > isize=5278880768 logical_bs=0 physical_bs=0
>
> Thank you, this is very helpful.
>
> Is it correct that the block device's inode->i_blkbits doesn't reflect
> its actual logical block size (512) or is that itself a bug somewhere
> in the block layer?
For a bdev, i_blkbits is only loosely tied to the storage device's block
size. i_blkbits can't be smaller than log2(lba_size) and it can't be
larger than PAGE_SHIFT (or the max folio order if LBS is present).
Typically it starts out matching PAGE_SIZE, but a program that opens the
block device can call BLKBSZSET to set the block size. This is usually
done by userspace filesystem tools to improve performance and/or ensure
that the kernel actually supports that block size.
Note that kernel filesystem drivers also call sb_set_blocksize to set
the block size.
The bdev block size seemingly resets to 4k after closing.
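The bounds described above can be written down as a small check. This is only an illustrative userspace model of the stated rule, not the kernel's actual validation; `ilog2_u32`, `bdev_blkbits_in_bounds` and the `max_shift` parameter (standing for PAGE_SHIFT, or a larger limit when LBS is present) are names invented for this sketch.

```c
#include <stdbool.h>

/* floor(log2(v)) for v > 0; illustrative helper, not a kernel API. */
static inline unsigned int ilog2_u32(unsigned int v)
{
	unsigned int bits = 0;

	while (v >>= 1)
		bits++;
	return bits;
}

/*
 * Model of the stated rule: log2(lba_size) <= blkbits <= max_shift.
 * lba_size is assumed to be a non-zero power of two.
 */
static inline bool bdev_blkbits_in_bounds(unsigned int blkbits,
					  unsigned int lba_size,
					  unsigned int max_shift)
{
	return blkbits >= ilog2_u32(lba_size) && blkbits <= max_shift;
}
```

Under this model, a device with 512-byte LBAs and a 4k page size accepts both blkbits=9 and blkbits=12, which is consistent with the report showing blkbits=12 on a device whose size is only 512-byte aligned.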
> On the iomap side, it uses inode->i_blkbits to
> determine whether or not an ifs should be attached and what logic to
> correspondingly call. For this case, shouldn't the inode's i_blkbits
> be 9 since "isize= 5278880768" indicates it's 512-byte aligned, not
> 4096-byte aligned? I'm not familiar with the block layer, so your
> thoughts on this would be appreciated.
<shrug> Regular files can have a larger i_blkbits than the underlying
storage device, so I don't see why bdevs would be different?
--D
> If it is fine for the block device's inode->i_blkbits to be a
> different value from the logical block size, then I think the iomap
> side needs this fix:
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 867e8ac761c8..03a97361570f 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -543,7 +543,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> * helper, then the helper owns the folio and will end
> * the read on it.
> */
> - if (*bytes_submitted == folio_len)
> + if (*bytes_submitted == folio_len || !ifs)
> ctx->cur_folio = NULL;
> }
>
> Could you verify that this fixes the hang?
>
> Thanks,
> Joanne
> >
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 21:55 ` Joanne Koong
@ 2026-03-17 9:34 ` Johannes Thumshirn
2026-03-17 19:48 ` Joanne Koong
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Thumshirn @ 2026-03-17 9:34 UTC (permalink / raw)
To: joannelkoong; +Cc: linux-fsdevel@vger.kernel.org
On 3/10/26 10:55 PM, Joanne Koong wrote:
> On Tue, Mar 10, 2026 at 2:08 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>> On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn
>> <Johannes.Thumshirn@wdc.com> wrote:
>>
>> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
>> index 867e8ac761c8..03a97361570f 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -543,7 +543,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
>> * helper, then the helper owns the folio and will end
>> * the read on it.
>> */
>> - if (*bytes_submitted == folio_len)
>> + if (*bytes_submitted == folio_len || !ifs)
>> ctx->cur_folio = NULL;
>> }
> it should be this instead:
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 867e8ac761c8..b803f518adaf 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -497,6 +497,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> loff_t length = iomap_length(iter);
> struct folio *folio = ctx->cur_folio;
> size_t folio_len = folio_size(folio);
> + struct iomap_folio_state *ifs;
> size_t poff, plen;
> loff_t pos_diff;
> int ret;
> @@ -508,7 +509,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> return iomap_iter_advance(iter, length);
> }
>
> - ifs_alloc(iter->inode, folio, iter->flags);
> + ifs = ifs_alloc(iter->inode, folio, iter->flags);
>
> length = min_t(loff_t, length, folio_len - offset_in_folio(folio, pos));
> while (length) {
> @@ -543,7 +544,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> * helper, then the helper owns the folio and will end
> * the read on it.
> */
> - if (*bytes_submitted == folio_len)
> + if (*bytes_submitted == folio_len || !ifs)
> ctx->cur_folio = NULL;
> }
>
> Thanks,
> Joanne
>
This version makes the test pass again.
Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Thanks a lot for this!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 21:59 ` Darrick J. Wong
@ 2026-03-17 19:47 ` Joanne Koong
0 siblings, 0 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-17 19:47 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Johannes Thumshirn, linux-fsdevel@vger.kernel.org
On Tue, Mar 10, 2026 at 2:59 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Tue, Mar 10, 2026 at 02:08:28PM -0700, Joanne Koong wrote:
> > On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn
> > <Johannes.Thumshirn@wdc.com> wrote:
> > >
> > >
> > > >
> > > > 3. If you're able to repro this consistently, would you be able to add
> > > > these lines right above the WARN_ON on line 487 and sharing what it
> > > > prints out?
> > > >
> > > > +++ b/fs/iomap/buffered-io.c
> > > > @@ -484,6 +484,17 @@ static void iomap_read_end(struct folio *folio,
> > > > size_t bytes_submitted)
> > > > * to the IO helper, in which case we are responsible for
> > > > * unlocking the folio here.
> > > > */
> > > > + if (bytes_submitted) {
> > > > + struct inode *inode = folio->mapping->host;
> > > > + struct block_device *bdev = inode->i_sb->s_bdev;
> > > > +
> > > > + pr_warn("bytes_submitted=%zu folio_size=%zu
> > > > blkbits=%u isize=%lld "
> > > > + "logical_bs=%u physical_bs=%u\n",
> > > > + bytes_submitted, folio_size(folio),
> > > > inode->i_blkbits,
> > > > + i_size_read(inode),
> > > > + bdev ? bdev_logical_block_size(bdev) : 0,
> > > > + bdev ? bdev_physical_block_size(bdev) : 0);
> > > > + }
> > > > WARN_ON_ONCE(bytes_submitted);
> > > > folio_unlock(folio);
> > > > }
> > >
> > > Here we go:
> > >
> > > [ 17.872952] bytes_submitted=1024 folio_size=4096 blkbits=12
> > > isize=5278880768 logical_bs=0 physical_bs=0
> >
> > Thank you, this is very helpful.
> >
> > Is it correct that the block device's inode->i_blkbits doesn't reflect
> > its actual logical block size (512) or is that itself a bug somewhere
> > in the block layer?
>
> For a bdev, i_blkbits is only loosely tied to the storage device's block
> size. i_blkbits can't be smaller than log2(lba_size) and it can't be
> larger than PAGE_SHIFT (or the max folio order if LBS is present).
>
> Typically it starts out matching PAGE_SIZE, but a program that opens the
> block device can call BLKBSZSET to set the block size. This is usually
> done by userspace filesystem tools to improve performance and/or ensure
> that the kernel actually supports that block size.
>
> Note that kernel filesystem drivers also call sb_set_blocksize to set
> the block size.
>
> The bdev block size seemingly resets to 4k after closing.
Thanks for the explanation. I hadn't realized inode->i_blkbits is only
loosely tied to the actual I/O granularity.
Thanks,
Joanne
>
> > On the iomap side, it uses inode->i_blkbits to
> > determine whether or not an ifs should be attached and what logic to
> > correspondingly call. For this case, shouldn't the inode's i_blkbits
> > be 9 since "isize= 5278880768" indicates it's 512-byte aligned, not
> > 4096-byte aligned? I'm not familiar with the block layer, so your
> > thoughts on this would be appreciated.
>
> <shrug> Regular files can have a larger i_blkbits than the underlying
> storage device, so I don't see why bdevs would be different?
>
> --D
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-17 9:34 ` Johannes Thumshirn
@ 2026-03-17 19:48 ` Joanne Koong
0 siblings, 0 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-17 19:48 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: linux-fsdevel@vger.kernel.org
On Tue, Mar 17, 2026 at 2:35 AM Johannes Thumshirn
<Johannes.Thumshirn@wdc.com> wrote:
>
> On 3/10/26 10:55 PM, Joanne Koong wrote:
> > On Tue, Mar 10, 2026 at 2:08 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> >> On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn
> >> <Johannes.Thumshirn@wdc.com> wrote:
> >>
> >> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> >> index 867e8ac761c8..03a97361570f 100644
> >> --- a/fs/iomap/buffered-io.c
> >> +++ b/fs/iomap/buffered-io.c
> >> @@ -543,7 +543,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> >> * helper, then the helper owns the folio and will end
> >> * the read on it.
> >> */
> >> - if (*bytes_submitted == folio_len)
> >> + if (*bytes_submitted == folio_len || !ifs)
> >> ctx->cur_folio = NULL;
> >> }
> > it should be this instead:
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 867e8ac761c8..b803f518adaf 100644
> > --- a/fs/iomap/buffered-io.c
> > +++ b/fs/iomap/buffered-io.c
> > @@ -497,6 +497,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> > loff_t length = iomap_length(iter);
> > struct folio *folio = ctx->cur_folio;
> > size_t folio_len = folio_size(folio);
> > + struct iomap_folio_state *ifs;
> > size_t poff, plen;
> > loff_t pos_diff;
> > int ret;
> > @@ -508,7 +509,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> > return iomap_iter_advance(iter, length);
> > }
> >
> > - ifs_alloc(iter->inode, folio, iter->flags);
> > + ifs = ifs_alloc(iter->inode, folio, iter->flags);
> >
> > length = min_t(loff_t, length, folio_len - offset_in_folio(folio, pos));
> > while (length) {
> > @@ -543,7 +544,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
> > * helper, then the helper owns the folio and will end
> > * the read on it.
> > */
> > - if (*bytes_submitted == folio_len)
> > + if (*bytes_submitted == folio_len || !ifs)
> > ctx->cur_folio = NULL;
> > }
> >
> > Thanks,
> > Joanne
> >
> This version is making the test pass again,
>
> Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> Thanks a lot for this!
>
Thanks for reporting this and testing / helping to debug the issue.
I'll submit the fix for this upstream today.
Thanks,
Joanne
^ permalink raw reply [flat|nested] 9+ messages in thread