* Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
@ 2026-03-09 16:37 Johannes Thumshirn
2026-03-09 20:56 ` Joanne Koong
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Thumshirn @ 2026-03-09 16:37 UTC (permalink / raw)
To: Joanne Koong; +Cc: linux-fsdevel@vger.kernel.org
Hi Joanne,
After commit aa35dd5cbc06 ("iomap: fix invalid folio access after
folio_end_read()") my zoned btrfs test setup hangs. I've bisected it to
this commit and reverting fixes my problem. The last thing I see in
dmesg is:
[ 9.387175] ------------[ cut here ]------------
[ 9.387320] WARNING: fs/iomap/buffered-io.c:487 at
iomap_read_end+0x11c/0x140, CPU#5: (udev-worker)/463
[ 9.387431] Modules linked in:
[ 9.387502] CPU: 5 UID: 0 PID: 463 Comm: (udev-worker) Not tainted
6.19.0-rc1+ #385 PREEMPT(full)
[ 9.387626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.17.0-9.fc43 06/10/2025
[ 9.387810] RIP: 0010:iomap_read_end+0x11c/0x140
[ 9.387886] Code: 00 48 89 ef 48 3b 04 24 0f 94 04 24 44 0f b6 34 24
e8 b8 88 69 00 48 89 df 48 83 c4 08 41 0f b6 f6 5b 5d 41 5e e9 54 e7 e8
ff <0f> 0b e9 48 ff ff ff ba 00 10 00 00 eb 9d 0f 0b e9 53 ff ff ff 0f
[ 9.388096] RSP: 0018:ffffc90000e279c0 EFLAGS: 00010206
[ 9.388178] RAX: ffff888111c6b800 RBX: ffffea0004363900 RCX:
0000000000000000
[ 9.388281] RDX: 0000000000000000 RSI: 0000000000000400 RDI:
ffffea0004363900
[ 9.388386] RBP: 0000000000000000 R08: 0000000000000000 R09:
0000000000000001
[ 9.388491] R10: ffffc90000e279a0 R11: ffff888111c6c168 R12:
ffffffff81e5cdb0
[ 9.388585] R13: ffffc90000e27c58 R14: ffffea0004363900 R15:
ffffc90000e27c58
[ 9.388690] FS: 00007f6e03fdfc00(0000) GS:ffff8882b4c13000(0000)
knlGS:0000000000000000
[ 9.388793] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9.388873] CR2: 00007f6e03280000 CR3: 00000001106d7006 CR4:
0000000000770eb0
[ 9.388967] PKRU: 55555554
[ 9.389006] Call Trace:
[ 9.389044] <TASK>
[ 9.389087] iomap_readahead+0x23c/0x2e0
[ 9.389167] blkdev_readahead+0x3d/0x50
[ 9.389222] read_pages+0x56/0x200
[ 9.389277] ? __folio_batch_add_and_move+0x1cf/0x2d0
[ 9.389354] page_cache_ra_unbounded+0x1db/0x2c0
[ 9.389423] force_page_cache_ra+0x96/0xb0
[ 9.389470] filemap_get_pages+0x12f/0x490
[ 9.389532] filemap_read+0xed/0x400
[ 9.389590] ? lock_acquire+0xd5/0x2b0
[ 9.389633] ? blkdev_read_iter+0x6b/0x180
[ 9.389678] ? lock_acquire+0xe5/0x2b0
[ 9.389720] ? lock_is_held_type+0xcd/0x130
[ 9.389761] ? find_held_lock+0x2b/0x80
[ 9.389813] ? lock_acquired+0x1e9/0x3c0
[ 9.389864] blkdev_read_iter+0x79/0x180
[ 9.389911] ? local_clock_noinstr+0x17/0x110
[ 9.389975] vfs_read+0x240/0x340
[ 9.390033] ksys_read+0x61/0xd0
[ 9.390083] do_syscall_64+0x74/0x3a0
[ 9.390143] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 9.390204] RIP: 0033:0x7f6e048c5c5e
[ 9.390255] Code: 4d 89 d8 e8 34 bd 00 00 4c 8b 5d f8 41 8b 93 08 03
00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f
05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa
[ 9.390447] RSP: 002b:00007ffdf50cded0 EFLAGS: 00000202 ORIG_RAX:
0000000000000000
[ 9.390529] RAX: ffffffffffffffda RBX: 000000013aa55200 RCX:
00007f6e048c5c5e
[ 9.390619] RDX: 0000000000000200 RSI: 00007f6e0327f000 RDI:
0000000000000014
[ 9.390704] RBP: 00007ffdf50cdee0 R08: 0000000000000000 R09:
0000000000000000
[ 9.390789] R10: 0000000000000000 R11: 0000000000000202 R12:
0000000000000000
[ 9.390866] R13: 000055d5f99fa270 R14: 000055d5f96e5cb0 R15:
000055d5f96e5cc8
[ 9.390967] </TASK>
[ 9.390997] irq event stamp: 54269
[ 9.391044] hardirqs last enabled at (54279): [<ffffffff8138ac02>]
__up_console_sem+0x52/0x60
[ 9.391141] hardirqs last disabled at (54288): [<ffffffff8138abe7>]
__up_console_sem+0x37/0x60
[ 9.391240] softirqs last enabled at (53796): [<ffffffff81304768>]
irq_exit_rcu+0x78/0x110
[ 9.391327] softirqs last disabled at (53787): [<ffffffff81304768>]
irq_exit_rcu+0x78/0x110
[ 9.391420] ---[ end trace 0000000000000000 ]---
I haven't debugged this further yet, maybe you have an idea what
could've caused it. On my side it's trivial to reproduce, so if you
can't reproduce it just yell.
Byte,
Johannes
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-09 16:37 Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()") Johannes Thumshirn
@ 2026-03-09 20:56 ` Joanne Koong
2026-03-10 11:44 ` Johannes Thumshirn
0 siblings, 1 reply; 9+ messages in thread
From: Joanne Koong @ 2026-03-09 20:56 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: linux-fsdevel@vger.kernel.org
On Mon, Mar 9, 2026 at 9:37 AM Johannes Thumshirn <Johannes.Thumshirn@wdc.com> wrote:
>
> Hi Joanne,
>
> After commit aa35dd5cbc06 ("iomap: fix invalid folio access after
> folio_end_read()") my zoned btrfs test setup hangs. I've bisected it to
> this commit and reverting fixes my problem. The last thing I see in
> dmesg is:
>
> [ 9.387175] ------------[ cut here ]------------
> [ 9.387320] WARNING: fs/iomap/buffered-io.c:487 at
> iomap_read_end+0x11c/0x140, CPU#5: (udev-worker)/463
> [...]
> [ 9.391420] ---[ end trace 0000000000000000 ]---
>
> I haven't debugged this further yet, maybe you have an idea what
> could've caused it. On my side it's trivial to reproduce, so if you
> can't reproduce it just yell.

Hi Johannes,

Thanks for your report and for bisecting this.

A few questions:

1. From the stack trace it looks like this is happening during a
block device read triggered by udev-worker's device probing. Does this
trigger for you consistently when the zoned device is probed or only
when running the btrfs xfstests generic/648?

2. I tried to repro your setup by running:

sudo modprobe null_blk nr_devices=2 zoned=1 zone_size=256 zone_nr_conv=8 memory_backed=1
sudo mkdir -p /mnt/test && sudo mkdir -p /mnt/scratch
sudo mkfs.btrfs -f /dev/nullb0
sudo mount /dev/nullb0 /mnt/test
sudo ./check generic/648

with local.config set to:

TEST_DEV=/dev/nullb0
TEST_DIR=/mnt/test
SCRATCH_DEV=/dev/nullb1
SCRATCH_MNT=/mnt/scratch
export FSTYP=btrfs

but I'm not seeing the hang or the WARNING in dmesg show up. I am
running it with PREEMPT(full) enabled. Does this match your setup or
are you doing something differently? Are you using null_blk or an
actual zoned device? If you are using null_blk, what module parameters
are you using?

3. If you're able to repro this consistently, would you be able to add
these lines right above the WARN_ON on line 487 and sharing what it
prints out?

+++ b/fs/iomap/buffered-io.c
@@ -484,6 +484,17 @@ static void iomap_read_end(struct folio *folio, size_t bytes_submitted)
 		 * to the IO helper, in which case we are responsible for
 		 * unlocking the folio here.
 		 */
+		if (bytes_submitted) {
+			struct inode *inode = folio->mapping->host;
+			struct block_device *bdev = inode->i_sb->s_bdev;
+
+			pr_warn("bytes_submitted=%zu folio_size=%zu blkbits=%u isize=%lld "
+				"logical_bs=%u physical_bs=%u\n",
+				bytes_submitted, folio_size(folio), inode->i_blkbits,
+				i_size_read(inode),
+				bdev ? bdev_logical_block_size(bdev) : 0,
+				bdev ? bdev_physical_block_size(bdev) : 0);
+		}
 		WARN_ON_ONCE(bytes_submitted);
 		folio_unlock(folio);
 	}

Thanks,
Joanne

>
> Byte,
>
> Johannes
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-09 20:56 ` Joanne Koong
@ 2026-03-10 11:44 ` Johannes Thumshirn
2026-03-10 21:08 ` Joanne Koong
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Thumshirn @ 2026-03-10 11:44 UTC (permalink / raw)
To: Joanne Koong; +Cc: linux-fsdevel@vger.kernel.org
On 3/9/26 9:56 PM, Joanne Koong wrote:
> A few questions:
> 1. From the stack trace it looks like this is happening during a
> block device read triggered by udev-worker's device probing. Does this
> trigger for you consistently when the zoned device is probed or only
> when running the btrfs xfstests generic/648?

It's only on g/648. I've also tried zoned XFS but g/648 didn't like my
XFS and skipped the test case.

> 2. I tried to repro your setup by running:
> sudo modprobe null_blk nr_devices=2 zoned=1 zone_size=256
> zone_nr_conv=8 memory_backed=1
> sudo mkdir -p /mnt/test && sudo mkdir -p /mnt/scratch
> sudo mkfs.btrfs -f /dev/nullb0
> sudo mount /dev/nullb0 /mnt/test
> sudo ./check generic/648
>
> with local.config set to:
> TEST_DEV=/dev/nullb0
> TEST_DIR=/mnt/test
> SCRATCH_DEV=/dev/nullb1
> SCRATCH_MNT=/mnt/scratch
> export FSTYP=btrfs
>
> but I'm not seeing the hang or the WARNING in dmesg show up. I am
> running it with PREEMPT(full) enabled. Does this match your setup or
> are you doing something differently? Are you using null_blk or an
> actual zoned device? If you are using null_blk, what module parameters
> are you using?

I'm using a zloop device passed to a VM via virtio-blk. The zloop
configuration is just:

echo "add id=0,zone_size_mb=256,conv_zones=4" > /dev/zloop-control
echo "add id=1,zone_size_mb=256,conv_zones=4" > /dev/zloop-control

The virtio-blk config is:

-drive driver=host_device,file=/dev/zloop0,if=virtio,cache.direct=on \
-drive driver=host_device,file=/dev/zloop1,if=virtio,cache.direct=on

The rootfs is on virtio-fs (using virtme-ng).

> 3. If you're able to repro this consistently, would you be able to add
> these lines right above the WARN_ON on line 487 and sharing what it
> prints out?
>
> [...]

Here we go:

[ 17.872952] bytes_submitted=1024 folio_size=4096 blkbits=12 isize=5278880768 logical_bs=0 physical_bs=0
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 11:44 ` Johannes Thumshirn
@ 2026-03-10 21:08 ` Joanne Koong
2026-03-10 21:55 ` Joanne Koong
2026-03-10 21:59 ` Darrick J. Wong
0 siblings, 2 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-10 21:08 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: linux-fsdevel@vger.kernel.org
On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn <Johannes.Thumshirn@wdc.com> wrote:
> > 3. If you're able to repro this consistently, would you be able to add
> > these lines right above the WARN_ON on line 487 and sharing what it
> > prints out?
> >
> > [...]
>
> Here we go:
>
> [ 17.872952] bytes_submitted=1024 folio_size=4096 blkbits=12
> isize=5278880768 logical_bs=0 physical_bs=0

Thank you, this is very helpful.

Is it correct that the block device's inode->i_blkbits doesn't reflect
its actual logical block size (512) or is that itself a bug somewhere
in the block layer? On the iomap side, it uses inode->i_blkbits to
determine whether or not an ifs should be attached and what logic to
correspondingly call. For this case, shouldn't the inode's i_blkbits
be 9 since "isize=5278880768" indicates it's 512-byte aligned, not
4096-byte aligned? I'm not familiar with the block layer, so your
thoughts on this would be appreciated.

If it is fine for the block device's inode->i_blkbits to be a
different value from the logical block size, then I think the iomap
side needs this fix:

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 867e8ac761c8..03a97361570f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -543,7 +543,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
 			 * helper, then the helper owns the folio and will end
 			 * the read on it.
 			 */
-			if (*bytes_submitted == folio_len)
+			if (*bytes_submitted == folio_len || !ifs)
 				ctx->cur_folio = NULL;
 		}

Could you verify that this fixes the hang?

Thanks,
Joanne
^ permalink raw reply related [flat|nested] 9+ messages in thread
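The alignment argument in the message above can be checked with a few lines of arithmetic. The sketch below is an illustration only and not part of the thread: the helper names are hypothetical, and it assumes the warning fired while reading the last folio of the block device, which the thread does not state explicitly. With the reported isize of 5278880768, the remainder modulo 4096 is 1024, which is the same value as the reported bytes_submitted, while the remainder modulo 512 is 0.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of the device that land in its final, partially covered folio. */
static inline uint64_t tail_bytes(uint64_t isize, uint64_t folio_size)
{
	assert(folio_size != 0);	/* guard against division by zero */
	return isize % folio_size;
}

/* Number of folios of folio_size bytes that the device covers completely. */
static inline uint64_t full_folios(uint64_t isize, uint64_t folio_size)
{
	assert(folio_size != 0);
	return isize / folio_size;
}
```

Calling tail_bytes(5278880768ULL, 4096) yields 1024 and tail_bytes(5278880768ULL, 512) yields 0, so a read that stops at the end of the device would cover only part of a 4096-byte folio.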
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 21:08 ` Joanne Koong
@ 2026-03-10 21:55 ` Joanne Koong
2026-03-17 9:34 ` Johannes Thumshirn
2026-03-10 21:59 ` Darrick J. Wong
1 sibling, 1 reply; 9+ messages in thread
From: Joanne Koong @ 2026-03-10 21:55 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: linux-fsdevel@vger.kernel.org
On Tue, Mar 10, 2026 at 2:08 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn
> <Johannes.Thumshirn@wdc.com> wrote:
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 867e8ac761c8..03a97361570f 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -543,7 +543,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
>  			 * helper, then the helper owns the folio and will end
>  			 * the read on it.
>  			 */
> -			if (*bytes_submitted == folio_len)
> +			if (*bytes_submitted == folio_len || !ifs)
>  				ctx->cur_folio = NULL;
>  		}

it should be this instead:

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 867e8ac761c8..b803f518adaf 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -497,6 +497,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
 	loff_t length = iomap_length(iter);
 	struct folio *folio = ctx->cur_folio;
 	size_t folio_len = folio_size(folio);
+	struct iomap_folio_state *ifs;
 	size_t poff, plen;
 	loff_t pos_diff;
 	int ret;
@@ -508,7 +509,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
 		return iomap_iter_advance(iter, length);
 	}

-	ifs_alloc(iter->inode, folio, iter->flags);
+	ifs = ifs_alloc(iter->inode, folio, iter->flags);

 	length = min_t(loff_t, length, folio_len - offset_in_folio(folio, pos));
 	while (length) {
@@ -543,7 +544,7 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
 			 * helper, then the helper owns the folio and will end
 			 * the read on it.
 			 */
-			if (*bytes_submitted == folio_len)
+			if (*bytes_submitted == folio_len || !ifs)
 				ctx->cur_folio = NULL;
 		}

Thanks,
Joanne
^ permalink raw reply related [flat|nested] 9+ messages in thread
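The effect of the one-line condition change in the diff above can be modelled in userspace. This is a simplified sketch, not kernel code and not part of the thread: the function names are hypothetical, `has_ifs` stands in for a non-NULL return value from ifs_alloc(), and it assumes (as the discussion of i_blkbits in the thread suggests) that no ifs is attached when i_blkbits gives a block size equal to the folio size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Pre-fix condition for clearing ctx->cur_folio, as shown in the diff. */
static inline bool clears_cur_folio_old(size_t bytes_submitted,
					size_t folio_len)
{
	return bytes_submitted == folio_len;
}

/* Post-fix condition: also clear it whenever no ifs is attached. */
static inline bool clears_cur_folio_new(size_t bytes_submitted,
					size_t folio_len, bool has_ifs)
{
	return bytes_submitted == folio_len || !has_ifs;
}

/*
 * Replays the reported values (1024 of 4096 bytes submitted).
 * Assumption: no ifs is attached, since blkbits=12 and folio_size=4096.
 */
static inline void replay_reported_case(void)
{
	assert(!clears_cur_folio_old(1024, 4096));
	assert(clears_cur_folio_new(1024, 4096, false));
}
```

Under these assumptions the old condition keeps ctx->cur_folio set for the reported read, while the new condition clears it; folios that do have an ifs keep the old behaviour.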
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 21:55 ` Joanne Koong
@ 2026-03-17 9:34 ` Johannes Thumshirn
2026-03-17 19:48 ` Joanne Koong
0 siblings, 1 reply; 9+ messages in thread
From: Johannes Thumshirn @ 2026-03-17 9:34 UTC (permalink / raw)
To: joannelkoong; +Cc: linux-fsdevel@vger.kernel.org
On 3/10/26 10:55 PM, Joanne Koong wrote:
> On Tue, Mar 10, 2026 at 2:08 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>> [...]
> it should be this instead:
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 867e8ac761c8..b803f518adaf 100644
> [...]
>
> Thanks,
> Joanne

This version is making the test pass again,

Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Thanks a lot for this!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-17 9:34 ` Johannes Thumshirn
@ 2026-03-17 19:48 ` Joanne Koong
0 siblings, 0 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-17 19:48 UTC (permalink / raw)
To: Johannes Thumshirn; +Cc: linux-fsdevel@vger.kernel.org
On Tue, Mar 17, 2026 at 2:35 AM Johannes Thumshirn <Johannes.Thumshirn@wdc.com> wrote:
>
> On 3/10/26 10:55 PM, Joanne Koong wrote:
> > it should be this instead:
> >
> > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> > index 867e8ac761c8..b803f518adaf 100644
> > [...]
>
> This version is making the test pass again,
>
> Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> Thanks a lot for this!
>

Thanks for reporting this and testing / helping to debug the issue.
I'll submit the fix for this upstream today.

Thanks,
Joanne
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 21:08 ` Joanne Koong
2026-03-10 21:55 ` Joanne Koong
@ 2026-03-10 21:59 ` Darrick J. Wong
2026-03-17 19:47 ` Joanne Koong
1 sibling, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2026-03-10 21:59 UTC (permalink / raw)
To: Joanne Koong; +Cc: Johannes Thumshirn, linux-fsdevel@vger.kernel.org
On Tue, Mar 10, 2026 at 02:08:28PM -0700, Joanne Koong wrote:
> On Tue, Mar 10, 2026 at 4:44 AM Johannes Thumshirn
> <Johannes.Thumshirn@wdc.com> wrote:
> > [...]
> > Here we go:
> >
> > [ 17.872952] bytes_submitted=1024 folio_size=4096 blkbits=12
> > isize=5278880768 logical_bs=0 physical_bs=0
>
> Thank you, this is very helpful.
>
> Is it correct that the block device's inode->i_blkbits doesn't reflect
> its actual logical block size (512) or is that itself a bug somewhere
> in the block layer?

For a bdev, i_blkbits is only loosely tied to the storage device's block
size. i_blkbits can't be smaller than log2(lba_size) and it can't be
larger than PAGE_SHIFT (or the max folio order if LBS is present).

Typically it starts out matching PAGE_SIZE, but a program that opens the
block device can call BLKBSZSET to set the block size. This is usually
done by userspace filesystem tools to improve performance and/or ensure
that the kernel actually supports that block size.

Note that kernel filesystem drivers also call sb_set_blocksize to set
the block size.

The bdev block size seemingly resets to 4k after closing.

> On the iomap side, it uses inode->i_blkbits to
> determine whether or not an ifs should be attached and what logic to
> correspondingly call. For this case, shouldn't the inode's i_blkbits
> be 9 since "isize= 5278880768" indicates it's 512-byte aligned, not
> 4096-byte aligned? I'm not familiar with the block layer, so your
> thoughts on this would be appreciated.

<shrug> Regular files can have a larger i_blkbits than the underlying
storage device, so I don't see why bdevs would be different?

--D

> If it is fine for the block device's inode->i_blkbits to be a
> different value from the logical block size, then I think the iomap
> side needs this fix:
>
> [...]
>
> Could you verify that this fixes the hang?
>
> Thanks,
> Joanne
^ permalink raw reply [flat|nested] 9+ messages in thread
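The difference between the bdev's soft block size and the device's own block sizes, as described in the message above, can be observed from userspace by querying both. The sketch below is an illustration and not part of the thread: the struct and function names are hypothetical, and it assumes that the BLKBSZGET ioctl reports the soft block size that BLKBSZSET changes, while BLKSSZGET and BLKPBSZGET report the logical and physical block sizes of the device.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/* Hypothetical container for the three sizes queried below. */
struct bdev_sizes {
	int soft_block_size;			/* from BLKBSZGET */
	int logical_block_size;			/* from BLKSSZGET */
	unsigned int physical_block_size;	/* from BLKPBSZGET */
};

/*
 * Fills *out for the block device at path.
 * Returns 0 on success, -1 if the path cannot be opened or any of the
 * ioctls fails (for example because path is not a block device).
 */
static inline int query_bdev_sizes(const char *path, struct bdev_sizes *out)
{
	int fd, ret = -1;

	assert(path != NULL && out != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (ioctl(fd, BLKBSZGET, &out->soft_block_size) == 0 &&
	    ioctl(fd, BLKSSZGET, &out->logical_block_size) == 0 &&
	    ioctl(fd, BLKPBSZGET, &out->physical_block_size) == 0)
		ret = 0;

	close(fd);
	return ret;
}
```

Calling query_bdev_sizes() on a device node (the caller needs read permission on it) and comparing soft_block_size with logical_block_size shows whether the two currently differ for that device.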
* Re: Hang in generic/648 on zoned btrfs after aa35dd5cbc06 ("iomap: fix invalid folio access after folio_end_read()")
2026-03-10 21:59 ` Darrick J. Wong
@ 2026-03-17 19:47 ` Joanne Koong
0 siblings, 0 replies; 9+ messages in thread
From: Joanne Koong @ 2026-03-17 19:47 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Johannes Thumshirn, linux-fsdevel@vger.kernel.org
On Tue, Mar 10, 2026 at 2:59 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Tue, Mar 10, 2026 at 02:08:28PM -0700, Joanne Koong wrote:
> > [...]
> > Is it correct that the block device's inode->i_blkbits doesn't reflect
> > its actual logical block size (512) or is that itself a bug somewhere
> > in the block layer?
>
> For a bdev, i_blkbits is only loosely tied to the storage device's block
> size. i_blkbits can't be smaller than log2(lba_size) and it can't be
> larger than PAGE_SHIFT (or the max folio order if LBS is present).
>
> Typically it starts out matching PAGE_SIZE, but a program that opens the
> block device can call BLKBSZSET to set the block size. This is usually
> done by userspace filesystem tools to improve performance and/or ensure
> that the kernel actually supports that block size.
>
> Note that kernel filesystem drivers also call sb_set_blocksize to set
> the block size.
>
> The bdev block size seemingly resets to 4k after closing.

Thanks for the explanation. I hadn't realized inode->i_blkbits is only
loosely tied to the actual I/O granularity.

Thanks,
Joanne

>
> > On the iomap side, it uses inode->i_blkbits to
> > determine whether or not an ifs should be attached and what logic to
> > correspondingly call. For this case, shouldn't the inode's i_blkbits
> > be 9 since "isize= 5278880768" indicates it's 512-byte aligned, not
> > 4096-byte aligned? I'm not familiar with the block layer, so your
> > thoughts on this would be appreciated.
>
> <shrug> Regular files can have a larger i_blkbits than the underlying
> storage device, so I don't see why bdevs would be different?
>
> --D
>
^ permalink raw reply [flat|nested] 9+ messages in thread