* [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
@ 2026-01-07 16:39 Yi Zhang
2026-01-07 16:48 ` Jens Axboe
0 siblings, 1 reply; 13+ messages in thread
From: Yi Zhang @ 2026-01-07 16:39 UTC (permalink / raw)
To: linux-block; +Cc: Jens Axboe, Ming Lei, Shinichiro Kawasaki
Hi
The following issue[2] was triggered by blktests nvme/059 and it's
100% reproduced with commit[1]. Please help check it and let me know
if you need any info/test for it.
Seems it's one regression, I will try to test with the latest
linux-block/for-next and also bisect it tomorrow.
[1]
commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
Merge: 29cefd61e0c6 fcf463b92a08
Author: Jens Axboe <axboe@kernel.dk>
Date: Tue Jan 6 05:48:07 2026 -0700
Merge branch 'for-7.0/blk-pvec' into for-next
* for-7.0/blk-pvec:
types: move phys_vec definition to common header
nvme-pci: Use size_t for length fields to handle larger sizes
[2]
[16866.579229] run blktests nvme/049 at 2026-01-07 02:00:14
[16869.709147] slab io_kiocb start ffff88825e6ad400 pointer offset 144 size 248
[16869.716399] list_add corruption. prev->next should be next
(ffff888200596100), but was 0000000000000000. (prev=ffff88825e6ad490).
[16869.728106] ------------[ cut here ]------------
[16869.732738] kernel BUG at lib/list_debug.c:32!
[16869.737209] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
[16869.742790] CPU: 15 UID: 0 PID: 71799 Comm: fio Kdump: loaded Not
tainted 6.19.0-rc3+ #1 PREEMPT(voluntary)
[16869.752614] Hardware name: Dell Inc. PowerEdge R6515/07PXPY, BIOS
2.21.1 09/24/2025
[16869.760267] RIP: 0010:__list_add_valid_or_report+0xf9/0x130
[16869.765849] Code: 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c
02 00 75 3c 49 8b 55 00 4c 89 e9 48 89 de 48 c7 c7 40 6d f6 9e e8 67
e1 a1 fe <0f> 0b 4c 89 e7 e8 8d eb 78 ff e9 3c ff ff ff 4c 89 ef e8 80
eb 78
[16869.784600] RSP: 0018:ffffc9000aadf990 EFLAGS: 00010282
[16869.789835] RAX: 0000000000000075 RBX: ffff888200596100 RCX: 0000000000000000
[16869.796967] RDX: 0000000000000075 RSI: ffffffff9ef66980 RDI: fffff5200155bf24
[16869.804101] RBP: ffff88825e6adc10 R08: 0000000000000001 R09: fffff5200155bee6
[16869.811234] R10: ffffc9000aadf737 R11: 0000000000000001 R12: ffff888200596108
[16869.818366] R13: ffff88825e6ad490 R14: ffff88825e6ad490 R15: ffff88825e6adc10
[16869.825500] FS: 00007f01a51bb740(0000) GS:ffff88887f6c4000(0000)
knlGS:0000000000000000
[16869.833591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[16869.839338] CR2: 00007f019cdb7430 CR3: 00000001eeeae000 CR4: 0000000000350ef0
[16869.846469] Call Trace:
[16869.848923] <TASK>
[16869.851034] io_issue_sqe+0x7eb/0xdd0
[16869.854707] ? srso_return_thunk+0x5/0x5f
[16869.858725] ? io_uring_cmd_prep+0x350/0x560
[16869.863012] io_submit_sqes+0x475/0x1000
[16869.866942] ? srso_return_thunk+0x5/0x5f
[16869.870969] ? __pfx_io_submit_sqes+0x10/0x10
[16869.875332] ? srso_return_thunk+0x5/0x5f
[16869.879352] ? __fget_files+0x1b6/0x2f0
[16869.883208] __do_sys_io_uring_enter+0x433/0x820
[16869.887829] ? fput+0x4c/0xa0
[16869.890809] ? __pfx___do_sys_io_uring_enter+0x10/0x10
[16869.895958] ? srso_return_thunk+0x5/0x5f
[16869.899978] ? srso_return_thunk+0x5/0x5f
[16869.903999] ? rcu_is_watching+0x15/0xb0
[16869.907934] ? srso_return_thunk+0x5/0x5f
[16869.911953] ? trace_irq_enable.constprop.0+0x13d/0x190
[16869.917183] ? srso_return_thunk+0x5/0x5f
[16869.921203] ? syscall_trace_enter+0x13e/0x230
[16869.925656] ? srso_return_thunk+0x5/0x5f
[16869.929685] do_syscall_64+0x95/0x520
[16869.933363] ? srso_return_thunk+0x5/0x5f
[16869.937380] ? trace_irq_enable.constprop.0+0x13d/0x190
[16869.942608] ? srso_return_thunk+0x5/0x5f
[16869.946628] ? do_syscall_64+0x16d/0x520
[16869.950556] ? __pfx_pgd_none+0x10/0x10
[16869.954408] ? srso_return_thunk+0x5/0x5f
[16869.958424] ? __handle_mm_fault+0x97e/0x11d0
[16869.962795] ? __pfx_css_rstat_updated+0x10/0x10
[16869.967421] ? __pfx___handle_mm_fault+0x10/0x10
[16869.972050] ? srso_return_thunk+0x5/0x5f
[16869.976069] ? rcu_is_watching+0x15/0xb0
[16869.979995] ? srso_return_thunk+0x5/0x5f
[16869.984016] ? trace_count_memcg_events+0x14f/0x1a0
[16869.988905] ? srso_return_thunk+0x5/0x5f
[16869.992924] ? count_memcg_events+0xe5/0x370
[16869.997198] ? srso_return_thunk+0x5/0x5f
[16870.001218] ? srso_return_thunk+0x5/0x5f
[16870.005232] ? __up_read+0x2c5/0x700
[16870.008821] ? __pfx___up_read+0x10/0x10
[16870.012756] ? handle_mm_fault+0x452/0x8a0
[16870.016862] ? do_user_addr_fault+0x274/0xa60
[16870.021229] ? srso_return_thunk+0x5/0x5f
[16870.025241] ? rcu_is_watching+0x15/0xb0
[16870.029172] ? srso_return_thunk+0x5/0x5f
[16870.033189] ? rcu_is_watching+0x15/0xb0
[16870.037114] ? srso_return_thunk+0x5/0x5f
[16870.041126] ? trace_irq_enable.constprop.0+0x13d/0x190
[16870.046353] ? srso_return_thunk+0x5/0x5f
[16870.050368] ? srso_return_thunk+0x5/0x5f
[16870.054387] ? irqentry_exit+0x93/0x5f0
[16870.058229] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[16870.063288] RIP: 0033:0x558b6250d067
[16870.066876] Code: 24 94 00 00 00 85 f6 74 78 49 8b 44 24 20 41 8b
3c 24 45 31 c0 45 31 c9 41 ba 01 00 00 00 31 d2 44 8b 38 b8 aa 01 00
00 0f 05 <48> 89 c3 89 c5 85 c0 7e 90 89 c2 44 89 fe 4c 89 ef e8 c3 d6
ff ff
[16870.085630] RSP: 002b:00007ffc479b70b0 EFLAGS: 00000246 ORIG_RAX:
00000000000001aa
[16870.093205] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000558b6250d067
[16870.100335] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000007
[16870.107468] RBP: 00007f019cdb6000 R08: 0000000000000000 R09: 0000000000000000
[16870.114601] R10: 0000000000000001 R11: 0000000000000246 R12: 0000558b9f3eab00
[16870.121734] R13: 00007f019cdb6000 R14: 0000558b62527000 R15: 0000000000000001
[16870.128882] </TASK>
[16870.131077] Modules linked in: ext4 crc16 mbcache jbd2
rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace
nfs_localio netfs platform_profile dell_wmi dell_smbios intel_rapl_msr
amd_atl intel_rapl_common sparse_keymap amd64_edac rfkill edac_mce_amd
video vfat dcdbas fat kvm_amd cdc_ether usbnet kvm mii irqbypass
mgag200 wmi_bmof dell_wmi_descriptor rapl i2c_algo_bit pcspkr
acpi_cpufreq ipmi_ssif ptdma i2c_piix4 k10temp i2c_smbus
acpi_power_meter ipmi_si acpi_ipmi ipmi_devintf ipmi_msghandler sg
loop fuse xfs sd_mod nvme ahci libahci nvme_core mpt3sas
ghash_clmulni_intel tg3 nvme_keyring ccp libata raid_class nvme_auth
hkdf scsi_transport_sas sp5100_tco wmi sunrpc dm_mirror dm_region_hash
dm_log dm_mod nfnetlink [last unloaded: nvmet]
--
Best Regards,
Yi Zhang
^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-07 16:39 [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049 Yi Zhang
@ 2026-01-07 16:48 ` Jens Axboe
  2026-01-08 6:39   ` Yi Zhang
  0 siblings, 1 reply; 13+ messages in thread

From: Jens Axboe @ 2026-01-07 16:48 UTC (permalink / raw)
  To: Yi Zhang, linux-block; +Cc: Ming Lei, Shinichiro Kawasaki

On 1/7/26 9:39 AM, Yi Zhang wrote:
> Hi
> The following issue[2] was triggered by blktests nvme/059 and it's

nvme/049 presumably?

> 100% reproduced with commit[1]. Please help check it and let me know
> if you need any info/test for it.
> Seems it's one regression, I will try to test with the latest
> linux-block/for-next and also bisect it tomorrow.

Doesn't reproduce for me on the current tree, but nothing since:

> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> Merge: 29cefd61e0c6 fcf463b92a08
> Author: Jens Axboe <axboe@kernel.dk>
> Date: Tue Jan 6 05:48:07 2026 -0700
>
>     Merge branch 'for-7.0/blk-pvec' into for-next

should have impacted that. So please do bisect.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-07 16:48 ` Jens Axboe
@ 2026-01-08 6:39   ` Yi Zhang
  2026-01-14 5:58     ` [bug report][bisected] " Yi Zhang
  0 siblings, 1 reply; 13+ messages in thread

From: Yi Zhang @ 2026-01-08 6:39 UTC (permalink / raw)
  To: Jens Axboe, fengnanchang; +Cc: linux-block, Ming Lei, Shinichiro Kawasaki

On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 1/7/26 9:39 AM, Yi Zhang wrote:
> > Hi
> > The following issue[2] was triggered by blktests nvme/059 and it's
>
> nvme/049 presumably?
>
Yes.

> > 100% reproduced with commit[1]. Please help check it and let me know
> > if you need any info/test for it.
> > Seems it's one regression, I will try to test with the latest
> > linux-block/for-next and also bisect it tomorrow.
>
> Doesn't reproduce for me on the current tree, but nothing since:
>
> > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > Merge: 29cefd61e0c6 fcf463b92a08
> > Author: Jens Axboe <axboe@kernel.dk>
> > Date: Tue Jan 6 05:48:07 2026 -0700
> >
> >     Merge branch 'for-7.0/blk-pvec' into for-next
>
> should have impacted that. So please do bisect.

Hi Jens
The issue seems was introduced from below commit.
and the issue cannot be reproduced after reverting this commit.

3c7d76d6128a io_uring: IOPOLL polling improvements

>
> --
> Jens Axboe
>

-- 
Best Regards,
  Yi Zhang

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-08 6:39 ` Yi Zhang
@ 2026-01-14 5:58   ` Yi Zhang
  2026-01-14 9:40     ` Alexander Atanasov
  2026-01-14 14:11    ` Ming Lei
  0 siblings, 2 replies; 13+ messages in thread

From: Yi Zhang @ 2026-01-14 5:58 UTC (permalink / raw)
  To: Jens Axboe, fengnanchang; +Cc: linux-block, Ming Lei, Shinichiro Kawasaki

On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>
> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > On 1/7/26 9:39 AM, Yi Zhang wrote:
> > > Hi
> > > The following issue[2] was triggered by blktests nvme/059 and it's
> >
> > nvme/049 presumably?
> >
> Yes.
>
> > > 100% reproduced with commit[1]. Please help check it and let me know
> > > if you need any info/test for it.
> > > Seems it's one regression, I will try to test with the latest
> > > linux-block/for-next and also bisect it tomorrow.
> >
> > Doesn't reproduce for me on the current tree, but nothing since:
> >
> > > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > > Merge: 29cefd61e0c6 fcf463b92a08
> > > Author: Jens Axboe <axboe@kernel.dk>
> > > Date: Tue Jan 6 05:48:07 2026 -0700
> > >
> > >     Merge branch 'for-7.0/blk-pvec' into for-next
> >
> > should have impacted that. So please do bisect.
>
> Hi Jens
> The issue seems was introduced from below commit.
> and the issue cannot be reproduced after reverting this commit.

The issue still can be reproduced on the latest linux-block/for-next

>
> 3c7d76d6128a io_uring: IOPOLL polling improvements
>
> >
> > --
> > Jens Axboe
> >
>
>
> --
> Best Regards,
>   Yi Zhang

-- 
Best Regards,
  Yi Zhang

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-14 5:58 ` [bug report][bisected] " Yi Zhang
@ 2026-01-14 9:40   ` Alexander Atanasov
  2026-01-14 12:43     ` Christoph Hellwig
  1 sibling, 1 reply; 13+ messages in thread

From: Alexander Atanasov @ 2026-01-14 9:40 UTC (permalink / raw)
  To: Yi Zhang, Jens Axboe, fengnanchang
  Cc: linux-block, Ming Lei, Shinichiro Kawasaki

Hello Yi,

On 14.01.26 7:58, Yi Zhang wrote:
> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>
>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>> Hi
>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>
>>> nvme/049 presumably?
>>>
>> Yes.
>>
>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>> if you need any info/test for it.
>>>> Seems it's one regression, I will try to test with the latest
>>>> linux-block/for-next and also bisect it tomorrow.
>>>
>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>
>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>
>>>>     Merge branch 'for-7.0/blk-pvec' into for-next
>>>
>>> should have impacted that. So please do bisect.
>>
>> Hi Jens
>> The issue seems was introduced from below commit.
>> and the issue cannot be reproduced after reverting this commit.
>
> The issue still can be reproduced on the latest linux-block/for-next
>
>>
>> 3c7d76d6128a io_uring: IOPOLL polling improvements

Double linked lists require init, single lists do not (including
io_wq_work_list). iopoll_node is never list_init-ed. So init before adding.

Can you check if this fixes it for you? If yes, i will submit it as a
proper patch - no way to test it at the moment.

-- 
have fun,
alex

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cac292d103f1..fba0ae0cbf7b 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1679,6 +1679,7 @@ static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags)
 		ctx->poll_multi_queue = true;
 	}
 
+	list_init(&&req->iopoll_node);
 	list_add_tail(&req->iopoll_node, &ctx->iopoll_list);
 
 	if (unlikely(needs_lock)) {

^ permalink raw reply related	[flat|nested] 13+ messages in thread
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-14 9:40 ` Alexander Atanasov
@ 2026-01-14 12:43   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread

From: Christoph Hellwig @ 2026-01-14 12:43 UTC (permalink / raw)
  To: alex+zkern
  Cc: Yi Zhang, Jens Axboe, fengnanchang, linux-block, Ming Lei,
	Shinichiro Kawasaki

On Wed, Jan 14, 2026 at 11:40:41AM +0200, Alexander Atanasov wrote:
> Double linked lists require init, single lists do not (including
> io_wq_work_list). iopoll_node is never list_init-ed. So init before adding.
>
> Can you check if this fixes it for you? If yes, i will submit it as a
> proper patch - no way to test it at the moment.

The heads (anchors) of lists need initializations.  The entries added
to the list do not.  I know this is a bit confusing because they use
the same type, but besides not compiling due to the double-&, the
patch would not do anything even in the version that would compile.

^ permalink raw reply	[flat|nested] 13+ messages in thread
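For readers following along: the distinction Christoph draws is between the
list head (the anchor, such as ctx->iopoll_list, which must be initialized
before first use) and the entries linked onto it, whose prev/next pointers
are written unconditionally by list_add_tail(). A minimal sketch, assuming
the usual <linux/list.h> semantics:

#include <linux/list.h>

struct item {
	int value;
	struct list_head node;	/* entry: no init needed before list_add_tail() */
};

/* head (anchor): must be initialized, e.g. via LIST_HEAD()/INIT_LIST_HEAD() */
static LIST_HEAD(item_list);

static void add_item(struct item *it)
{
	/*
	 * list_add_tail() overwrites it->node.next and it->node.prev
	 * unconditionally, so pre-initializing the entry changes nothing;
	 * an uninitialized head, by contrast, is just garbage pointers.
	 */
	list_add_tail(&it->node, &item_list);
}
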
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-14 5:58 ` [bug report][bisected] " Yi Zhang
  2026-01-14 9:40   ` Alexander Atanasov
@ 2026-01-14 14:11  ` Ming Lei
  2026-01-14 14:43     ` Jens Axboe
  2026-01-16 11:54     ` Alexander Atanasov
  1 sibling, 2 replies; 13+ messages in thread

From: Ming Lei @ 2026-01-14 14:11 UTC (permalink / raw)
  To: Yi Zhang; +Cc: Jens Axboe, fengnanchang, linux-block, Shinichiro Kawasaki

On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> >
> > On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
> > >
> > > On 1/7/26 9:39 AM, Yi Zhang wrote:
> > > > Hi
> > > > The following issue[2] was triggered by blktests nvme/059 and it's
> > >
> > > nvme/049 presumably?
> > >
> > Yes.
> >
> > > > 100% reproduced with commit[1]. Please help check it and let me know
> > > > if you need any info/test for it.
> > > > Seems it's one regression, I will try to test with the latest
> > > > linux-block/for-next and also bisect it tomorrow.
> > >
> > > Doesn't reproduce for me on the current tree, but nothing since:
> > >
> > > > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > > > Merge: 29cefd61e0c6 fcf463b92a08
> > > > Author: Jens Axboe <axboe@kernel.dk>
> > > > Date: Tue Jan 6 05:48:07 2026 -0700
> > > >
> > > >     Merge branch 'for-7.0/blk-pvec' into for-next
> > >
> > > should have impacted that. So please do bisect.
> >
> > Hi Jens
> > The issue seems was introduced from below commit.
> > and the issue cannot be reproduced after reverting this commit.
>
> The issue still can be reproduced on the latest linux-block/for-next

Hi Yi,

Can you try the following patch?


diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index a9c097dacad6..7b0e62b8322b 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
 
 	/*
-	 * IOPOLL could potentially complete this request directly, but
-	 * if multiple rings are polling on the same queue, then it's possible
-	 * for one ring to find completions for another ring. Punting the
-	 * completion via task_work will always direct it to the right
-	 * location, rather than potentially complete requests for ringA
-	 * under iopoll invocations from ringB.
+	 * For IOPOLL, complete the request inline. The request's io_kiocb
+	 * uses a union for io_task_work and iopoll_node, so scheduling
+	 * task_work would corrupt the iopoll_list while the request is
+	 * still on it. io_uring_cmd_done() handles IOPOLL by setting
+	 * iopoll_completed rather than scheduling task_work.
+	 *
+	 * For non-IOPOLL, complete via task_work to ensure we run in the
+	 * submitter's context and handling multiple rings is safe.
 	 */
-	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
+	if (blk_rq_is_poll(req)) {
+		if (pdu->bio)
+			blk_rq_unmap_user(pdu->bio);
+		io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
+	} else {
+		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
+	}
+
 	return RQ_END_IO_FREE;
 }


Thanks,
Ming

^ permalink raw reply related	[flat|nested] 13+ messages in thread
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-14 14:11 ` Ming Lei
@ 2026-01-14 14:43   ` Jens Axboe
  2026-01-14 14:58     ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread

From: Jens Axboe @ 2026-01-14 14:43 UTC (permalink / raw)
  To: Ming Lei, Yi Zhang; +Cc: fengnanchang, linux-block, Shinichiro Kawasaki

On 1/14/26 7:11 AM, Ming Lei wrote:
> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
>> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>>
>>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>>> Hi
>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>>
>>>> nvme/049 presumably?
>>>>
>>> Yes.
>>>
>>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>>> if you need any info/test for it.
>>>>> Seems it's one regression, I will try to test with the latest
>>>>> linux-block/for-next and also bisect it tomorrow.
>>>>
>>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>>
>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>>
>>>>>     Merge branch 'for-7.0/blk-pvec' into for-next
>>>>
>>>> should have impacted that. So please do bisect.
>>>
>>> Hi Jens
>>> The issue seems was introduced from below commit.
>>> and the issue cannot be reproduced after reverting this commit.
>>
>> The issue still can be reproduced on the latest linux-block/for-next
>
> Hi Yi,
>
> Can you try the following patch?
>
>
> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> index a9c097dacad6..7b0e62b8322b 100644
> --- a/drivers/nvme/host/ioctl.c
> +++ b/drivers/nvme/host/ioctl.c
> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
> 
> 	/*
> -	 * IOPOLL could potentially complete this request directly, but
> -	 * if multiple rings are polling on the same queue, then it's possible
> -	 * for one ring to find completions for another ring. Punting the
> -	 * completion via task_work will always direct it to the right
> -	 * location, rather than potentially complete requests for ringA
> -	 * under iopoll invocations from ringB.
> +	 * For IOPOLL, complete the request inline. The request's io_kiocb
> +	 * uses a union for io_task_work and iopoll_node, so scheduling
> +	 * task_work would corrupt the iopoll_list while the request is
> +	 * still on it. io_uring_cmd_done() handles IOPOLL by setting
> +	 * iopoll_completed rather than scheduling task_work.
> +	 *
> +	 * For non-IOPOLL, complete via task_work to ensure we run in the
> +	 * submitter's context and handling multiple rings is safe.
> 	 */
> -	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> +	if (blk_rq_is_poll(req)) {
> +		if (pdu->bio)
> +			blk_rq_unmap_user(pdu->bio);
> +		io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> +	} else {
> +		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> +	}
> +
> 	return RQ_END_IO_FREE;
> }
> 

Ah yes that should fix it, the task_work addition will conflict with
the list addition. Don't think it's safe though, which is why I made
them all use task_work previously. Let me fix it in the IOPOLL patch
instead.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-14 14:43 ` Jens Axboe
@ 2026-01-14 14:58   ` Jens Axboe
  2026-01-14 15:20     ` Ming Lei
  0 siblings, 1 reply; 13+ messages in thread

From: Jens Axboe @ 2026-01-14 14:58 UTC (permalink / raw)
  To: Ming Lei, Yi Zhang; +Cc: fengnanchang, linux-block, Shinichiro Kawasaki

On 1/14/26 7:43 AM, Jens Axboe wrote:
> On 1/14/26 7:11 AM, Ming Lei wrote:
>> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
>>> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>>>
>>>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>>>> Hi
>>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>>>
>>>>> nvme/049 presumably?
>>>>>
>>>> Yes.
>>>>
>>>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>>>> if you need any info/test for it.
>>>>>> Seems it's one regression, I will try to test with the latest
>>>>>> linux-block/for-next and also bisect it tomorrow.
>>>>>
>>>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>>>
>>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>>>
>>>>>>     Merge branch 'for-7.0/blk-pvec' into for-next
>>>>>
>>>>> should have impacted that. So please do bisect.
>>>>
>>>> Hi Jens
>>>> The issue seems was introduced from below commit.
>>>> and the issue cannot be reproduced after reverting this commit.
>>>
>>> The issue still can be reproduced on the latest linux-block/for-next
>>
>> Hi Yi,
>>
>> Can you try the following patch?
>>
>>
>> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
>> index a9c097dacad6..7b0e62b8322b 100644
>> --- a/drivers/nvme/host/ioctl.c
>> +++ b/drivers/nvme/host/ioctl.c
>> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
>> 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
>> 
>> 	/*
>> -	 * IOPOLL could potentially complete this request directly, but
>> -	 * if multiple rings are polling on the same queue, then it's possible
>> -	 * for one ring to find completions for another ring. Punting the
>> -	 * completion via task_work will always direct it to the right
>> -	 * location, rather than potentially complete requests for ringA
>> -	 * under iopoll invocations from ringB.
>> +	 * For IOPOLL, complete the request inline. The request's io_kiocb
>> +	 * uses a union for io_task_work and iopoll_node, so scheduling
>> +	 * task_work would corrupt the iopoll_list while the request is
>> +	 * still on it. io_uring_cmd_done() handles IOPOLL by setting
>> +	 * iopoll_completed rather than scheduling task_work.
>> +	 *
>> +	 * For non-IOPOLL, complete via task_work to ensure we run in the
>> +	 * submitter's context and handling multiple rings is safe.
>> 	 */
>> -	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
>> +	if (blk_rq_is_poll(req)) {
>> +		if (pdu->bio)
>> +			blk_rq_unmap_user(pdu->bio);
>> +		io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
>> +	} else {
>> +		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
>> +	}
>> +
>> 	return RQ_END_IO_FREE;
>> }
>> 
> 
> Ah yes that should fix it, the task_work addition will conflict with
> the list addition. Don't think it's safe though, which is why I made
> them all use task_work previously. Let me fix it in the IOPOLL patch
> instead.

This should be better:

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index dd084a55bed8..1fa8d829cbac 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -719,13 +719,10 @@ struct io_kiocb {
 	atomic_t			refs;
 	bool				cancel_seq_set;
 
-	/*
-	 * IOPOLL doesn't use task_work, so use the ->iopoll_node list
-	 * entry to manage pending iopoll requests.
-	 */
 	union {
 		struct io_task_work	io_task_work;
-		struct list_head	iopoll_node;
+		/* For IOPOLL setup queues, with hybrid polling */
+		u64			iopoll_start;
 	};
 
 	union {
@@ -734,8 +731,8 @@ struct io_kiocb {
 		 * poll
 		 */
 		struct hlist_node	hash_node;
-		/* For IOPOLL setup queues, with hybrid polling */
-		u64			iopoll_start;
+		/* IOPOLL completion handling */
+		struct list_head	iopoll_node;
 		/* for private io_kiocb freeing */
 		struct rcu_head		rcu_head;
 	};

-- 
Jens Axboe

^ permalink raw reply related	[flat|nested] 13+ messages in thread
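To make the failure mode concrete: in the old layout the iopoll list entry
shared union storage with the task_work hook, so queueing task_work for a
request that is still linked on ctx->iopoll_list scribbles over the list
pointers, which is exactly the "prev->next should be next ... but was
0000000000000000" corruption reported at the top of the thread. A simplified,
hypothetical sketch of the aliasing (the struct and field names below are
illustrative, not the real io_kiocb definitions):

/* Hypothetical, simplified layout; for illustration only. */
struct list_node {
	struct list_node *next, *prev;
};

struct fake_req {
	union {
		struct {
			void (*func)(void);		/* overlays iopoll_node.next */
			struct fake_req *tw_next;	/* overlays iopoll_node.prev */
		} task_work;
		struct list_node iopoll_node;		/* linked on the iopoll list */
	};
};

/*
 * If a completion path stores into ->task_work while the request is still
 * linked on the iopoll list, iopoll_node.next/prev no longer point at the
 * neighbouring entries, and the next list_add() on that list trips the
 * CONFIG_DEBUG_LIST check in lib/list_debug.c. Moving iopoll_node out of
 * the union, as in the patch above, removes the aliasing entirely.
 */
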
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-14 14:58 ` Jens Axboe
@ 2026-01-14 15:20   ` Ming Lei
  2026-01-14 15:26     ` Jens Axboe
  0 siblings, 1 reply; 13+ messages in thread

From: Ming Lei @ 2026-01-14 15:20 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Yi Zhang, fengnanchang, linux-block, Shinichiro Kawasaki

On Wed, Jan 14, 2026 at 07:58:54AM -0700, Jens Axboe wrote:
> On 1/14/26 7:43 AM, Jens Axboe wrote:
> > On 1/14/26 7:11 AM, Ming Lei wrote:
> >> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
> >>> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> >>>>
> >>>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>>>>
> >>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
> >>>>>> Hi
> >>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
> >>>>>
> >>>>> nvme/049 presumably?
> >>>>>
> >>>> Yes.
> >>>>
> >>>>>> 100% reproduced with commit[1]. Please help check it and let me know
> >>>>>> if you need any info/test for it.
> >>>>>> Seems it's one regression, I will try to test with the latest
> >>>>>> linux-block/for-next and also bisect it tomorrow.
> >>>>>
> >>>>> Doesn't reproduce for me on the current tree, but nothing since:
> >>>>>
> >>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> >>>>>> Merge: 29cefd61e0c6 fcf463b92a08
> >>>>>> Author: Jens Axboe <axboe@kernel.dk>
> >>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
> >>>>>>
> >>>>>>     Merge branch 'for-7.0/blk-pvec' into for-next
> >>>>>
> >>>>> should have impacted that. So please do bisect.
> >>>>
> >>>> Hi Jens
> >>>> The issue seems was introduced from below commit.
> >>>> and the issue cannot be reproduced after reverting this commit.
> >>>
> >>> The issue still can be reproduced on the latest linux-block/for-next
> >>
> >> Hi Yi,
> >>
> >> Can you try the following patch?
> >>
> >>
> >> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> >> index a9c097dacad6..7b0e62b8322b 100644
> >> --- a/drivers/nvme/host/ioctl.c
> >> +++ b/drivers/nvme/host/ioctl.c
> >> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> >> 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
> >> 
> >> 	/*
> >> -	 * IOPOLL could potentially complete this request directly, but
> >> -	 * if multiple rings are polling on the same queue, then it's possible
> >> -	 * for one ring to find completions for another ring. Punting the
> >> -	 * completion via task_work will always direct it to the right
> >> -	 * location, rather than potentially complete requests for ringA
> >> -	 * under iopoll invocations from ringB.
> >> +	 * For IOPOLL, complete the request inline. The request's io_kiocb
> >> +	 * uses a union for io_task_work and iopoll_node, so scheduling
> >> +	 * task_work would corrupt the iopoll_list while the request is
> >> +	 * still on it. io_uring_cmd_done() handles IOPOLL by setting
> >> +	 * iopoll_completed rather than scheduling task_work.
> >> +	 *
> >> +	 * For non-IOPOLL, complete via task_work to ensure we run in the
> >> +	 * submitter's context and handling multiple rings is safe.
> >> 	 */
> >> -	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> >> +	if (blk_rq_is_poll(req)) {
> >> +		if (pdu->bio)
> >> +			blk_rq_unmap_user(pdu->bio);
> >> +		io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> >> +	} else {
> >> +		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> >> +	}
> >> +
> >> 	return RQ_END_IO_FREE;
> >> }
> >> 
> > 
> > Ah yes that should fix it, the task_work addition will conflict with
> > the list addition. Don't think it's safe though, which is why I made
> > them all use task_work previously. Let me fix it in the IOPOLL patch
> > instead.
> 
> This should be better:
> 
> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index dd084a55bed8..1fa8d829cbac 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -719,13 +719,10 @@ struct io_kiocb {
> 	atomic_t			refs;
> 	bool				cancel_seq_set;
> 
> -	/*
> -	 * IOPOLL doesn't use task_work, so use the ->iopoll_node list
> -	 * entry to manage pending iopoll requests.
> -	 */
> 	union {
> 		struct io_task_work	io_task_work;
> -		struct list_head	iopoll_node;
> +		/* For IOPOLL setup queues, with hybrid polling */
> +		u64			iopoll_start;
> 	};
> 
> 	union {
> @@ -734,8 +731,8 @@ struct io_kiocb {
> 		 * poll
> 		 */
> 		struct hlist_node	hash_node;
> -		/* For IOPOLL setup queues, with hybrid polling */
> -		u64			iopoll_start;
> +		/* IOPOLL completion handling */
> +		struct list_head	iopoll_node;
> 		/* for private io_kiocb freeing */
> 		struct rcu_head		rcu_head;
> 	};

This way looks better, just `req->iopoll_start` needs to read to local
variable first in io_uring_hybrid_poll().


Thanks,
Ming

^ permalink raw reply	[flat|nested] 13+ messages in thread
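The follow-up Ming points at: with iopoll_start now sharing the union with
io_task_work, the hybrid-poll path has to snapshot the timestamp before it
polls, because a completion found while polling can queue task_work and
overwrite that storage. A sketch of what the v2 would need, shaped like
io_uring_hybrid_poll() in io_uring/rw.c (helper and field names are taken
from current mainline and should be treated as assumptions here):

/* Sketch only; not the verbatim io_uring_hybrid_poll(). */
static int hybrid_poll_sketch(struct io_kiocb *req, struct io_comp_batch *iob,
			      unsigned int poll_flags)
{
	struct io_ring_ctx *ctx = req->ctx;
	u64 start = req->iopoll_start;	/* snapshot before polling */
	u64 runtime;
	int ret;

	/* ... hybrid-poll sleep handling elided ... */

	ret = io_uring_classic_poll(req, iob, poll_flags);

	/*
	 * After polling, the request may already have queued task_work,
	 * which now shares storage with ->iopoll_start; only the local
	 * snapshot taken above is still valid.
	 */
	runtime = ktime_get_ns() - start;
	if (ctx->hybrid_poll_time > runtime)
		ctx->hybrid_poll_time = runtime;
	return ret;
}
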
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-14 15:20 ` Ming Lei
@ 2026-01-14 15:26   ` Jens Axboe
  0 siblings, 0 replies; 13+ messages in thread

From: Jens Axboe @ 2026-01-14 15:26 UTC (permalink / raw)
  To: Ming Lei; +Cc: Yi Zhang, fengnanchang, linux-block, Shinichiro Kawasaki

On 1/14/26 8:20 AM, Ming Lei wrote:
> On Wed, Jan 14, 2026 at 07:58:54AM -0700, Jens Axboe wrote:
>> On 1/14/26 7:43 AM, Jens Axboe wrote:
>>> On 1/14/26 7:11 AM, Ming Lei wrote:
>>>> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
>>>>> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>>>>>
>>>>>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>
>>>>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>>>>>> Hi
>>>>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>>>>>
>>>>>>> nvme/049 presumably?
>>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>>>>>> if you need any info/test for it.
>>>>>>>> Seems it's one regression, I will try to test with the latest
>>>>>>>> linux-block/for-next and also bisect it tomorrow.
>>>>>>>
>>>>>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>>>>>
>>>>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>>>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>>>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>>>>>
>>>>>>>>     Merge branch 'for-7.0/blk-pvec' into for-next
>>>>>>>
>>>>>>> should have impacted that. So please do bisect.
>>>>>>
>>>>>> Hi Jens
>>>>>> The issue seems was introduced from below commit.
>>>>>> and the issue cannot be reproduced after reverting this commit.
>>>>>
>>>>> The issue still can be reproduced on the latest linux-block/for-next
>>>>
>>>> Hi Yi,
>>>>
>>>> Can you try the following patch?
>>>>
>>>>
>>>> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
>>>> index a9c097dacad6..7b0e62b8322b 100644
>>>> --- a/drivers/nvme/host/ioctl.c
>>>> +++ b/drivers/nvme/host/ioctl.c
>>>> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
>>>> 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
>>>> 
>>>> 	/*
>>>> -	 * IOPOLL could potentially complete this request directly, but
>>>> -	 * if multiple rings are polling on the same queue, then it's possible
>>>> -	 * for one ring to find completions for another ring. Punting the
>>>> -	 * completion via task_work will always direct it to the right
>>>> -	 * location, rather than potentially complete requests for ringA
>>>> -	 * under iopoll invocations from ringB.
>>>> +	 * For IOPOLL, complete the request inline. The request's io_kiocb
>>>> +	 * uses a union for io_task_work and iopoll_node, so scheduling
>>>> +	 * task_work would corrupt the iopoll_list while the request is
>>>> +	 * still on it. io_uring_cmd_done() handles IOPOLL by setting
>>>> +	 * iopoll_completed rather than scheduling task_work.
>>>> +	 *
>>>> +	 * For non-IOPOLL, complete via task_work to ensure we run in the
>>>> +	 * submitter's context and handling multiple rings is safe.
>>>> 	 */
>>>> -	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
>>>> +	if (blk_rq_is_poll(req)) {
>>>> +		if (pdu->bio)
>>>> +			blk_rq_unmap_user(pdu->bio);
>>>> +		io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
>>>> +	} else {
>>>> +		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
>>>> +	}
>>>> +
>>>> 	return RQ_END_IO_FREE;
>>>> }
>>>>
>>>
>>> Ah yes that should fix it, the task_work addition will conflict with
>>> the list addition. Don't think it's safe though, which is why I made
>>> them all use task_work previously. Let me fix it in the IOPOLL patch
>>> instead.
>>
>> This should be better:
>>
>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
>> index dd084a55bed8..1fa8d829cbac 100644
>> --- a/include/linux/io_uring_types.h
>> +++ b/include/linux/io_uring_types.h
>> @@ -719,13 +719,10 @@ struct io_kiocb {
>> 	atomic_t			refs;
>> 	bool				cancel_seq_set;
>> 
>> -	/*
>> -	 * IOPOLL doesn't use task_work, so use the ->iopoll_node list
>> -	 * entry to manage pending iopoll requests.
>> -	 */
>> 	union {
>> 		struct io_task_work	io_task_work;
>> -		struct list_head	iopoll_node;
>> +		/* For IOPOLL setup queues, with hybrid polling */
>> +		u64			iopoll_start;
>> 	};
>> 
>> 	union {
>> @@ -734,8 +731,8 @@ struct io_kiocb {
>> 		 * poll
>> 		 */
>> 		struct hlist_node	hash_node;
>> -		/* For IOPOLL setup queues, with hybrid polling */
>> -		u64			iopoll_start;
>> +		/* IOPOLL completion handling */
>> +		struct list_head	iopoll_node;
>> 		/* for private io_kiocb freeing */
>> 		struct rcu_head		rcu_head;
>> 	};
> 
> This way looks better, just `req->iopoll_start` needs to read to local
> variable first in io_uring_hybrid_poll().

True, let me send out a v2.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-14 14:11 ` Ming Lei
  2026-01-14 14:43   ` Jens Axboe
@ 2026-01-16 11:54   ` Alexander Atanasov
  2026-01-16 12:41     ` Ming Lei
  1 sibling, 1 reply; 13+ messages in thread

From: Alexander Atanasov @ 2026-01-16 11:54 UTC (permalink / raw)
  To: Ming Lei, Yi Zhang
  Cc: Jens Axboe, fengnanchang, linux-block, Shinichiro Kawasaki

Hello Ming,

On 14.01.26 16:11, Ming Lei wrote:
> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
>> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>>
>>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>>> Hi
>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>>
>>>> nvme/049 presumably?
>>>>
>>> Yes.
>>>
>>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>>> if you need any info/test for it.
>>>>> Seems it's one regression, I will try to test with the latest
>>>>> linux-block/for-next and also bisect it tomorrow.
>>>>
>>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>>
>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>>
>>>>>     Merge branch 'for-7.0/blk-pvec' into for-next
>>>>
>>>> should have impacted that. So please do bisect.
>>>
>>> Hi Jens
>>> The issue seems was introduced from below commit.
>>> and the issue cannot be reproduced after reverting this commit.
>>
>> The issue still can be reproduced on the latest linux-block/for-next
>
> Hi Yi,
>
> Can you try the following patch?
>
>
> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> index a9c097dacad6..7b0e62b8322b 100644
> --- a/drivers/nvme/host/ioctl.c
> +++ b/drivers/nvme/host/ioctl.c
> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
> 
> 	/*
> -	 * IOPOLL could potentially complete this request directly, but
> -	 * if multiple rings are polling on the same queue, then it's possible
> -	 * for one ring to find completions for another ring. Punting the
> -	 * completion via task_work will always direct it to the right
> -	 * location, rather than potentially complete requests for ringA
> -	 * under iopoll invocations from ringB.
> +	 * For IOPOLL, complete the request inline. The request's io_kiocb
> +	 * uses a union for io_task_work and iopoll_node, so scheduling
> +	 * task_work would corrupt the iopoll_list while the request is
> +	 * still on it. io_uring_cmd_done() handles IOPOLL by setting
> +	 * iopoll_completed rather than scheduling task_work.
> +	 *
> +	 * For non-IOPOLL, complete via task_work to ensure we run in the
> +	 * submitter's context and handling multiple rings is safe.
> 	 */
> -	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> +	if (blk_rq_is_poll(req)) {
> +		if (pdu->bio)
> +			blk_rq_unmap_user(pdu->bio);
> +		io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> +	} else {
> +		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> +	}
> +
> 	return RQ_END_IO_FREE;
> }
>

While this is a good optimisation and it will fix the list issue for a
single user - it may crash with multiple users of the context. I am still
learning this code, so excuse my ignorance here and there.

The bisected patch 3c7d76d6128a changed io_wq_work_list which looks like
safe to be used without locks (it is a derivate of llist), list_head
require proper locking to be safe.

ctx can be used to poll multiple files, iopoll_list is a list for that
reason.
sqpoll is calling io_iopoll_req_issued without lock -> it does list_add_tail
if that races with other list addition or deletion it will corrupt the list.

is there any mechanism to prevent that? or i am missing something?

-- 
have fun,
alex

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
  2026-01-16 11:54 ` Alexander Atanasov
@ 2026-01-16 12:41   ` Ming Lei
  0 siblings, 0 replies; 13+ messages in thread

From: Ming Lei @ 2026-01-16 12:41 UTC (permalink / raw)
  To: alex+zkern
  Cc: Yi Zhang, Jens Axboe, fengnanchang, linux-block, Shinichiro Kawasaki

On Fri, Jan 16, 2026 at 01:54:15PM +0200, Alexander Atanasov wrote:
> Hello Ming,
> 
> On 14.01.26 16:11, Ming Lei wrote:
> > On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
> > > On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> > > >
> > > > On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
> > > > >
> > > > > On 1/7/26 9:39 AM, Yi Zhang wrote:
> > > > > > Hi
> > > > > > The following issue[2] was triggered by blktests nvme/059 and it's
> > > > >
> > > > > nvme/049 presumably?
> > > > >
> > > > Yes.
> > > >
> > > > > > 100% reproduced with commit[1]. Please help check it and let me know
> > > > > > if you need any info/test for it.
> > > > > > Seems it's one regression, I will try to test with the latest
> > > > > > linux-block/for-next and also bisect it tomorrow.
> > > > >
> > > > > Doesn't reproduce for me on the current tree, but nothing since:
> > > > >
> > > > > > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > > > > > Merge: 29cefd61e0c6 fcf463b92a08
> > > > > > Author: Jens Axboe <axboe@kernel.dk>
> > > > > > Date: Tue Jan 6 05:48:07 2026 -0700
> > > > > >
> > > > > >     Merge branch 'for-7.0/blk-pvec' into for-next
> > > > >
> > > > > should have impacted that. So please do bisect.
> > > >
> > > > Hi Jens
> > > > The issue seems was introduced from below commit.
> > > > and the issue cannot be reproduced after reverting this commit.
> > >
> > > The issue still can be reproduced on the latest linux-block/for-next
> >
> > Hi Yi,
> >
> > Can you try the following patch?
> >
> >
> > diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> > index a9c097dacad6..7b0e62b8322b 100644
> > --- a/drivers/nvme/host/ioctl.c
> > +++ b/drivers/nvme/host/ioctl.c
> > @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> > 	pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
> > 
> > 	/*
> > -	 * IOPOLL could potentially complete this request directly, but
> > -	 * if multiple rings are polling on the same queue, then it's possible
> > -	 * for one ring to find completions for another ring. Punting the
> > -	 * completion via task_work will always direct it to the right
> > -	 * location, rather than potentially complete requests for ringA
> > -	 * under iopoll invocations from ringB.
> > +	 * For IOPOLL, complete the request inline. The request's io_kiocb
> > +	 * uses a union for io_task_work and iopoll_node, so scheduling
> > +	 * task_work would corrupt the iopoll_list while the request is
> > +	 * still on it. io_uring_cmd_done() handles IOPOLL by setting
> > +	 * iopoll_completed rather than scheduling task_work.
> > +	 *
> > +	 * For non-IOPOLL, complete via task_work to ensure we run in the
> > +	 * submitter's context and handling multiple rings is safe.
> > 	 */
> > -	io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> > +	if (blk_rq_is_poll(req)) {
> > +		if (pdu->bio)
> > +			blk_rq_unmap_user(pdu->bio);
> > +		io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> > +	} else {
> > +		io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> > +	}
> > +
> > 	return RQ_END_IO_FREE;
> > }
> > 
> While this is a good optimisation and it will fix the list issue for a
> single user - it may crash with multiple users of the context. I am still
> learning this code, so excuse my ignorance here and there.

Jens has sent the following fix already:

https://lore.kernel.org/io-uring/aWhGEMsaOf752f5z@fedora/T/#t

> 
> The bisected patch 3c7d76d6128a changed io_wq_work_list which looks like
> safe to be used without locks (it is a derivate of llist), list_head
> require proper locking to be safe.
> 
> ctx can be used to poll multiple files, iopoll_list is a list for that
> reason.
> sqpoll is calling io_iopoll_req_issued without lock -> it does list_add_tail
> if that races with other list addition or deletion it will corrupt the list.
> 
> is there any mechanism to prevent that? or i am missing something?

io_iopoll_req_issued() will grab ctx->uring_lock if it isn't held.


Thanks,
Ming

^ permalink raw reply	[flat|nested] 13+ messages in thread
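For completeness, the conditional locking Ming describes looks roughly like
the following sketch of io_iopoll_req_issued() (paraphrased, with flag and
field names assumed from mainline rather than quoted from the tree under
discussion): the issue path tells the helper whether ->uring_lock is already
held, and the helper takes it only when it is not, so the list_add_tail()
onto ctx->iopoll_list always runs under the mutex, including for SQPOLL and
io-wq issue.

/* Paraphrased sketch; not the verbatim function. */
static void io_iopoll_req_issued_sketch(struct io_kiocb *req,
					unsigned int issue_flags)
{
	struct io_ring_ctx *ctx = req->ctx;
	const bool needs_lock = issue_flags & IO_URING_F_UNLOCKED;

	/* unlocked issue paths (e.g. io-wq) do not hold ->uring_lock yet */
	if (unlikely(needs_lock))
		mutex_lock(&ctx->uring_lock);

	/* ... poll_multi_queue bookkeeping elided ... */

	list_add_tail(&req->iopoll_node, &ctx->iopoll_list);

	if (unlikely(needs_lock)) {
		/* wake a sleeping SQPOLL thread before dropping the lock */
		if ((ctx->flags & IORING_SETUP_SQPOLL) &&
		    wq_has_sleeper(&ctx->sq_data->wait))
			wake_up(&ctx->sq_data->wait);
		mutex_unlock(&ctx->uring_lock);
	}
}
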
end of thread, other threads:[~2026-01-16 12:41 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-07 16:39 [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049 Yi Zhang
2026-01-07 16:48 ` Jens Axboe
2026-01-08  6:39   ` Yi Zhang
2026-01-14  5:58     ` [bug report][bisected] " Yi Zhang
2026-01-14  9:40       ` Alexander Atanasov
2026-01-14 12:43         ` Christoph Hellwig
2026-01-14 14:11       ` Ming Lei
2026-01-14 14:43         ` Jens Axboe
2026-01-14 14:58           ` Jens Axboe
2026-01-14 15:20             ` Ming Lei
2026-01-14 15:26               ` Jens Axboe
2026-01-16 11:54         ` Alexander Atanasov
2026-01-16 12:41           ` Ming Lei