* [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
@ 2026-01-07 16:39 Yi Zhang
2026-01-07 16:48 ` Jens Axboe
0 siblings, 1 reply; 13+ messages in thread
From: Yi Zhang @ 2026-01-07 16:39 UTC (permalink / raw)
To: linux-block; +Cc: Jens Axboe, Ming Lei, Shinichiro Kawasaki
Hi
The following issue[2] was triggered by blktests nvme/059 and it's
100% reproducible with commit[1]. Please help check it and let me know
if you need any further info or testing for it.
It seems to be a regression; I will try to test with the latest
linux-block/for-next and also bisect it tomorrow.
[1]
commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
Merge: 29cefd61e0c6 fcf463b92a08
Author: Jens Axboe <axboe@kernel.dk>
Date: Tue Jan 6 05:48:07 2026 -0700
Merge branch 'for-7.0/blk-pvec' into for-next
* for-7.0/blk-pvec:
types: move phys_vec definition to common header
nvme-pci: Use size_t for length fields to handle larger sizes
[2]
[16866.579229] run blktests nvme/049 at 2026-01-07 02:00:14
[16869.709147] slab io_kiocb start ffff88825e6ad400 pointer offset 144 size 248
[16869.716399] list_add corruption. prev->next should be next
(ffff888200596100), but was 0000000000000000. (prev=ffff88825e6ad490).
[16869.728106] ------------[ cut here ]------------
[16869.732738] kernel BUG at lib/list_debug.c:32!
[16869.737209] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
[16869.742790] CPU: 15 UID: 0 PID: 71799 Comm: fio Kdump: loaded Not
tainted 6.19.0-rc3+ #1 PREEMPT(voluntary)
[16869.752614] Hardware name: Dell Inc. PowerEdge R6515/07PXPY, BIOS
2.21.1 09/24/2025
[16869.760267] RIP: 0010:__list_add_valid_or_report+0xf9/0x130
[16869.765849] Code: 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c
02 00 75 3c 49 8b 55 00 4c 89 e9 48 89 de 48 c7 c7 40 6d f6 9e e8 67
e1 a1 fe <0f> 0b 4c 89 e7 e8 8d eb 78 ff e9 3c ff ff ff 4c 89 ef e8 80
eb 78
[16869.784600] RSP: 0018:ffffc9000aadf990 EFLAGS: 00010282
[16869.789835] RAX: 0000000000000075 RBX: ffff888200596100 RCX: 0000000000000000
[16869.796967] RDX: 0000000000000075 RSI: ffffffff9ef66980 RDI: fffff5200155bf24
[16869.804101] RBP: ffff88825e6adc10 R08: 0000000000000001 R09: fffff5200155bee6
[16869.811234] R10: ffffc9000aadf737 R11: 0000000000000001 R12: ffff888200596108
[16869.818366] R13: ffff88825e6ad490 R14: ffff88825e6ad490 R15: ffff88825e6adc10
[16869.825500] FS: 00007f01a51bb740(0000) GS:ffff88887f6c4000(0000)
knlGS:0000000000000000
[16869.833591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[16869.839338] CR2: 00007f019cdb7430 CR3: 00000001eeeae000 CR4: 0000000000350ef0
[16869.846469] Call Trace:
[16869.848923] <TASK>
[16869.851034] io_issue_sqe+0x7eb/0xdd0
[16869.854707] ? srso_return_thunk+0x5/0x5f
[16869.858725] ? io_uring_cmd_prep+0x350/0x560
[16869.863012] io_submit_sqes+0x475/0x1000
[16869.866942] ? srso_return_thunk+0x5/0x5f
[16869.870969] ? __pfx_io_submit_sqes+0x10/0x10
[16869.875332] ? srso_return_thunk+0x5/0x5f
[16869.879352] ? __fget_files+0x1b6/0x2f0
[16869.883208] __do_sys_io_uring_enter+0x433/0x820
[16869.887829] ? fput+0x4c/0xa0
[16869.890809] ? __pfx___do_sys_io_uring_enter+0x10/0x10
[16869.895958] ? srso_return_thunk+0x5/0x5f
[16869.899978] ? srso_return_thunk+0x5/0x5f
[16869.903999] ? rcu_is_watching+0x15/0xb0
[16869.907934] ? srso_return_thunk+0x5/0x5f
[16869.911953] ? trace_irq_enable.constprop.0+0x13d/0x190
[16869.917183] ? srso_return_thunk+0x5/0x5f
[16869.921203] ? syscall_trace_enter+0x13e/0x230
[16869.925656] ? srso_return_thunk+0x5/0x5f
[16869.929685] do_syscall_64+0x95/0x520
[16869.933363] ? srso_return_thunk+0x5/0x5f
[16869.937380] ? trace_irq_enable.constprop.0+0x13d/0x190
[16869.942608] ? srso_return_thunk+0x5/0x5f
[16869.946628] ? do_syscall_64+0x16d/0x520
[16869.950556] ? __pfx_pgd_none+0x10/0x10
[16869.954408] ? srso_return_thunk+0x5/0x5f
[16869.958424] ? __handle_mm_fault+0x97e/0x11d0
[16869.962795] ? __pfx_css_rstat_updated+0x10/0x10
[16869.967421] ? __pfx___handle_mm_fault+0x10/0x10
[16869.972050] ? srso_return_thunk+0x5/0x5f
[16869.976069] ? rcu_is_watching+0x15/0xb0
[16869.979995] ? srso_return_thunk+0x5/0x5f
[16869.984016] ? trace_count_memcg_events+0x14f/0x1a0
[16869.988905] ? srso_return_thunk+0x5/0x5f
[16869.992924] ? count_memcg_events+0xe5/0x370
[16869.997198] ? srso_return_thunk+0x5/0x5f
[16870.001218] ? srso_return_thunk+0x5/0x5f
[16870.005232] ? __up_read+0x2c5/0x700
[16870.008821] ? __pfx___up_read+0x10/0x10
[16870.012756] ? handle_mm_fault+0x452/0x8a0
[16870.016862] ? do_user_addr_fault+0x274/0xa60
[16870.021229] ? srso_return_thunk+0x5/0x5f
[16870.025241] ? rcu_is_watching+0x15/0xb0
[16870.029172] ? srso_return_thunk+0x5/0x5f
[16870.033189] ? rcu_is_watching+0x15/0xb0
[16870.037114] ? srso_return_thunk+0x5/0x5f
[16870.041126] ? trace_irq_enable.constprop.0+0x13d/0x190
[16870.046353] ? srso_return_thunk+0x5/0x5f
[16870.050368] ? srso_return_thunk+0x5/0x5f
[16870.054387] ? irqentry_exit+0x93/0x5f0
[16870.058229] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[16870.063288] RIP: 0033:0x558b6250d067
[16870.066876] Code: 24 94 00 00 00 85 f6 74 78 49 8b 44 24 20 41 8b
3c 24 45 31 c0 45 31 c9 41 ba 01 00 00 00 31 d2 44 8b 38 b8 aa 01 00
00 0f 05 <48> 89 c3 89 c5 85 c0 7e 90 89 c2 44 89 fe 4c 89 ef e8 c3 d6
ff ff
[16870.085630] RSP: 002b:00007ffc479b70b0 EFLAGS: 00000246 ORIG_RAX:
00000000000001aa
[16870.093205] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000558b6250d067
[16870.100335] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000007
[16870.107468] RBP: 00007f019cdb6000 R08: 0000000000000000 R09: 0000000000000000
[16870.114601] R10: 0000000000000001 R11: 0000000000000246 R12: 0000558b9f3eab00
[16870.121734] R13: 00007f019cdb6000 R14: 0000558b62527000 R15: 0000000000000001
[16870.128882] </TASK>
[16870.131077] Modules linked in: ext4 crc16 mbcache jbd2
rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace
nfs_localio netfs platform_profile dell_wmi dell_smbios intel_rapl_msr
amd_atl intel_rapl_common sparse_keymap amd64_edac rfkill edac_mce_amd
video vfat dcdbas fat kvm_amd cdc_ether usbnet kvm mii irqbypass
mgag200 wmi_bmof dell_wmi_descriptor rapl i2c_algo_bit pcspkr
acpi_cpufreq ipmi_ssif ptdma i2c_piix4 k10temp i2c_smbus
acpi_power_meter ipmi_si acpi_ipmi ipmi_devintf ipmi_msghandler sg
loop fuse xfs sd_mod nvme ahci libahci nvme_core mpt3sas
ghash_clmulni_intel tg3 nvme_keyring ccp libata raid_class nvme_auth
hkdf scsi_transport_sas sp5100_tco wmi sunrpc dm_mirror dm_region_hash
dm_log dm_mod nfnetlink [last unloaded: nvmet]
--
Best Regards,
Yi Zhang
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-07 16:39 [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049 Yi Zhang
@ 2026-01-07 16:48 ` Jens Axboe
2026-01-08 6:39 ` Yi Zhang
0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2026-01-07 16:48 UTC (permalink / raw)
To: Yi Zhang, linux-block; +Cc: Ming Lei, Shinichiro Kawasaki
On 1/7/26 9:39 AM, Yi Zhang wrote:
> Hi
> The following issue[2] was triggered by blktests nvme/059 and it's
nvme/049 presumably?
> 100% reproduced with commit[1]. Please help check it and let me know
> if you need any info/test for it.
> Seems it's one regression, I will try to test with the latest
> linux-block/for-next and also bisect it tomorrow.
Doesn't reproduce for me on the current tree, but nothing since:
> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> Merge: 29cefd61e0c6 fcf463b92a08
> Author: Jens Axboe <axboe@kernel.dk>
> Date: Tue Jan 6 05:48:07 2026 -0700
>
> Merge branch 'for-7.0/blk-pvec' into for-next
should have impacted that. So please do bisect.
--
Jens Axboe
* Re: [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-07 16:48 ` Jens Axboe
@ 2026-01-08 6:39 ` Yi Zhang
2026-01-14 5:58 ` [bug report][bisected] " Yi Zhang
0 siblings, 1 reply; 13+ messages in thread
From: Yi Zhang @ 2026-01-08 6:39 UTC (permalink / raw)
To: Jens Axboe, fengnanchang; +Cc: linux-block, Ming Lei, Shinichiro Kawasaki
On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 1/7/26 9:39 AM, Yi Zhang wrote:
> > Hi
> > The following issue[2] was triggered by blktests nvme/059 and it's
>
> nvme/049 presumably?
>
Yes.
> > 100% reproduced with commit[1]. Please help check it and let me know
> > if you need any info/test for it.
> > Seems it's one regression, I will try to test with the latest
> > linux-block/for-next and also bisect it tomorrow.
>
> Doesn't reproduce for me on the current tree, but nothing since:
>
> > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > Merge: 29cefd61e0c6 fcf463b92a08
> > Author: Jens Axboe <axboe@kernel.dk>
> > Date: Tue Jan 6 05:48:07 2026 -0700
> >
> > Merge branch 'for-7.0/blk-pvec' into for-next
>
> should have impacted that. So please do bisect.
Hi Jens
The issue seems to have been introduced by the commit below,
and it cannot be reproduced after reverting this commit.
3c7d76d6128a io_uring: IOPOLL polling improvements
>
> --
> Jens Axboe
>
--
Best Regards,
Yi Zhang
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-08 6:39 ` Yi Zhang
@ 2026-01-14 5:58 ` Yi Zhang
2026-01-14 9:40 ` Alexander Atanasov
2026-01-14 14:11 ` Ming Lei
0 siblings, 2 replies; 13+ messages in thread
From: Yi Zhang @ 2026-01-14 5:58 UTC (permalink / raw)
To: Jens Axboe, fengnanchang; +Cc: linux-block, Ming Lei, Shinichiro Kawasaki
On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>
> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > On 1/7/26 9:39 AM, Yi Zhang wrote:
> > > Hi
> > > The following issue[2] was triggered by blktests nvme/059 and it's
> >
> > nvme/049 presumably?
> >
> Yes.
>
> > > 100% reproduced with commit[1]. Please help check it and let me know
> > > if you need any info/test for it.
> > > Seems it's one regression, I will try to test with the latest
> > > linux-block/for-next and also bisect it tomorrow.
> >
> > Doesn't reproduce for me on the current tree, but nothing since:
> >
> > > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > > Merge: 29cefd61e0c6 fcf463b92a08
> > > Author: Jens Axboe <axboe@kernel.dk>
> > > Date: Tue Jan 6 05:48:07 2026 -0700
> > >
> > > Merge branch 'for-7.0/blk-pvec' into for-next
> >
> > should have impacted that. So please do bisect.
>
> Hi Jens
> The issue seems was introduced from below commit.
> and the issue cannot be reproduced after reverting this commit.
The issue can still be reproduced on the latest linux-block/for-next
>
> 3c7d76d6128a io_uring: IOPOLL polling improvements
>
> >
> > --
> > Jens Axboe
> >
>
>
> --
> Best Regards,
> Yi Zhang
--
Best Regards,
Yi Zhang
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-14 5:58 ` [bug report][bisected] " Yi Zhang
@ 2026-01-14 9:40 ` Alexander Atanasov
2026-01-14 12:43 ` Christoph Hellwig
2026-01-14 14:11 ` Ming Lei
1 sibling, 1 reply; 13+ messages in thread
From: Alexander Atanasov @ 2026-01-14 9:40 UTC (permalink / raw)
To: Yi Zhang, Jens Axboe, fengnanchang
Cc: linux-block, Ming Lei, Shinichiro Kawasaki
Hello Yi,
On 14.01.26 7:58, Yi Zhang wrote:
> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>
>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>> Hi
>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>
>>> nvme/049 presumably?
>>>
>> Yes.
>>
>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>> if you need any info/test for it.
>>>> Seems it's one regression, I will try to test with the latest
>>>> linux-block/for-next and also bisect it tomorrow.
>>>
>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>
>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>
>>>> Merge branch 'for-7.0/blk-pvec' into for-next
>>>
>>> should have impacted that. So please do bisect.
>>
>> Hi Jens
>> The issue seems was introduced from below commit.
>> and the issue cannot be reproduced after reverting this commit.
>
> The issue still can be reproduced on the latest linux-block/for-next
>
>>
>> 3c7d76d6128a io_uring: IOPOLL polling improvements
Doubly linked lists require init, singly linked lists do not (including
io_wq_work_list). iopoll_node is never list_init-ed, so init it before adding.
Can you check if this fixes it for you? If yes, I will submit it as a
proper patch - I have no way to test it at the moment.
--
have fun,
alex
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cac292d103f1..fba0ae0cbf7b 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1679,6 +1679,7 @@ static void io_iopoll_req_issued(struct io_kiocb
*req, unsigned int issue_flags)
ctx->poll_multi_queue = true;
}
+ list_init(&&req->iopoll_node);
list_add_tail(&req->iopoll_node, &ctx->iopoll_list);
if (unlikely(needs_lock)) {
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-14 9:40 ` Alexander Atanasov
@ 2026-01-14 12:43 ` Christoph Hellwig
0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2026-01-14 12:43 UTC (permalink / raw)
To: alex+zkern
Cc: Yi Zhang, Jens Axboe, fengnanchang, linux-block, Ming Lei,
Shinichiro Kawasaki
On Wed, Jan 14, 2026 at 11:40:41AM +0200, Alexander Atanasov wrote:
> Double linked lists require init, single lists do not (including
> io_wq_work_list). iopoll_node is never list_init-ed. So init before adding.
>
> Can you check if this fixes it for you? If yes, i will submit it as a proper
> patch - no way to test it at the moment.
The heads (anchors) of lists need initialization. The entries added
to the list do not. I know this is a bit confusing because they use
the same type, but besides not compiling due to the double-&, the patch
would not do anything even in a version that would compile.
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-14 5:58 ` [bug report][bisected] " Yi Zhang
2026-01-14 9:40 ` Alexander Atanasov
@ 2026-01-14 14:11 ` Ming Lei
2026-01-14 14:43 ` Jens Axboe
2026-01-16 11:54 ` Alexander Atanasov
1 sibling, 2 replies; 13+ messages in thread
From: Ming Lei @ 2026-01-14 14:11 UTC (permalink / raw)
To: Yi Zhang; +Cc: Jens Axboe, fengnanchang, linux-block, Shinichiro Kawasaki
On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> >
> > On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
> > >
> > > On 1/7/26 9:39 AM, Yi Zhang wrote:
> > > > Hi
> > > > The following issue[2] was triggered by blktests nvme/059 and it's
> > >
> > > nvme/049 presumably?
> > >
> > Yes.
> >
> > > > 100% reproduced with commit[1]. Please help check it and let me know
> > > > if you need any info/test for it.
> > > > Seems it's one regression, I will try to test with the latest
> > > > linux-block/for-next and also bisect it tomorrow.
> > >
> > > Doesn't reproduce for me on the current tree, but nothing since:
> > >
> > > > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > > > Merge: 29cefd61e0c6 fcf463b92a08
> > > > Author: Jens Axboe <axboe@kernel.dk>
> > > > Date: Tue Jan 6 05:48:07 2026 -0700
> > > >
> > > > Merge branch 'for-7.0/blk-pvec' into for-next
> > >
> > > should have impacted that. So please do bisect.
> >
> > Hi Jens
> > The issue seems was introduced from below commit.
> > and the issue cannot be reproduced after reverting this commit.
>
> The issue still can be reproduced on the latest linux-block/for-next
Hi Yi,
Can you try the following patch?
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index a9c097dacad6..7b0e62b8322b 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
/*
- * IOPOLL could potentially complete this request directly, but
- * if multiple rings are polling on the same queue, then it's possible
- * for one ring to find completions for another ring. Punting the
- * completion via task_work will always direct it to the right
- * location, rather than potentially complete requests for ringA
- * under iopoll invocations from ringB.
+ * For IOPOLL, complete the request inline. The request's io_kiocb
+ * uses a union for io_task_work and iopoll_node, so scheduling
+ * task_work would corrupt the iopoll_list while the request is
+ * still on it. io_uring_cmd_done() handles IOPOLL by setting
+ * iopoll_completed rather than scheduling task_work.
+ *
+ * For non-IOPOLL, complete via task_work to ensure we run in the
+ * submitter's context and handling multiple rings is safe.
*/
- io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
+ if (blk_rq_is_poll(req)) {
+ if (pdu->bio)
+ blk_rq_unmap_user(pdu->bio);
+ io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
+ } else {
+ io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
+ }
+
return RQ_END_IO_FREE;
}
Thanks,
Ming
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-14 14:11 ` Ming Lei
@ 2026-01-14 14:43 ` Jens Axboe
2026-01-14 14:58 ` Jens Axboe
2026-01-16 11:54 ` Alexander Atanasov
1 sibling, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2026-01-14 14:43 UTC (permalink / raw)
To: Ming Lei, Yi Zhang; +Cc: fengnanchang, linux-block, Shinichiro Kawasaki
On 1/14/26 7:11 AM, Ming Lei wrote:
> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
>> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>>
>>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>>> Hi
>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>>
>>>> nvme/049 presumably?
>>>>
>>> Yes.
>>>
>>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>>> if you need any info/test for it.
>>>>> Seems it's one regression, I will try to test with the latest
>>>>> linux-block/for-next and also bisect it tomorrow.
>>>>
>>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>>
>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>>
>>>>> Merge branch 'for-7.0/blk-pvec' into for-next
>>>>
>>>> should have impacted that. So please do bisect.
>>>
>>> Hi Jens
>>> The issue seems was introduced from below commit.
>>> and the issue cannot be reproduced after reverting this commit.
>>
>> The issue still can be reproduced on the latest linux-block/for-next
>
> Hi Yi,
>
> Can you try the following patch?
>
>
> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> index a9c097dacad6..7b0e62b8322b 100644
> --- a/drivers/nvme/host/ioctl.c
> +++ b/drivers/nvme/host/ioctl.c
> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
>
> /*
> - * IOPOLL could potentially complete this request directly, but
> - * if multiple rings are polling on the same queue, then it's possible
> - * for one ring to find completions for another ring. Punting the
> - * completion via task_work will always direct it to the right
> - * location, rather than potentially complete requests for ringA
> - * under iopoll invocations from ringB.
> + * For IOPOLL, complete the request inline. The request's io_kiocb
> + * uses a union for io_task_work and iopoll_node, so scheduling
> + * task_work would corrupt the iopoll_list while the request is
> + * still on it. io_uring_cmd_done() handles IOPOLL by setting
> + * iopoll_completed rather than scheduling task_work.
> + *
> + * For non-IOPOLL, complete via task_work to ensure we run in the
> + * submitter's context and handling multiple rings is safe.
> */
> - io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> + if (blk_rq_is_poll(req)) {
> + if (pdu->bio)
> + blk_rq_unmap_user(pdu->bio);
> + io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> + } else {
> + io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> + }
> +
> return RQ_END_IO_FREE;
> }
>
Ah yes that should fix it, the task_work addition will conflict with
the list addition. Don't think it's safe though, which is why I made
them all use task_work previously. Let me fix it in the IOPOLL patch
instead.
--
Jens Axboe
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-14 14:43 ` Jens Axboe
@ 2026-01-14 14:58 ` Jens Axboe
2026-01-14 15:20 ` Ming Lei
0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2026-01-14 14:58 UTC (permalink / raw)
To: Ming Lei, Yi Zhang; +Cc: fengnanchang, linux-block, Shinichiro Kawasaki
On 1/14/26 7:43 AM, Jens Axboe wrote:
> On 1/14/26 7:11 AM, Ming Lei wrote:
>> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
>>> On Thu, Jan 8, 2026 at 2:39?PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>>>
>>>> On Thu, Jan 8, 2026 at 12:48?AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>>>> Hi
>>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>>>
>>>>> nvme/049 presumably?
>>>>>
>>>> Yes.
>>>>
>>>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>>>> if you need any info/test for it.
>>>>>> Seems it's one regression, I will try to test with the latest
>>>>>> linux-block/for-next and also bisect it tomorrow.
>>>>>
>>>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>>>
>>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>>>
>>>>>> Merge branch 'for-7.0/blk-pvec' into for-next
>>>>>
>>>>> should have impacted that. So please do bisect.
>>>>
>>>> Hi Jens
>>>> The issue seems was introduced from below commit.
>>>> and the issue cannot be reproduced after reverting this commit.
>>>
>>> The issue still can be reproduced on the latest linux-block/for-next
>>
>> Hi Yi,
>>
>> Can you try the following patch?
>>
>>
>> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
>> index a9c097dacad6..7b0e62b8322b 100644
>> --- a/drivers/nvme/host/ioctl.c
>> +++ b/drivers/nvme/host/ioctl.c
>> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
>> pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
>>
>> /*
>> - * IOPOLL could potentially complete this request directly, but
>> - * if multiple rings are polling on the same queue, then it's possible
>> - * for one ring to find completions for another ring. Punting the
>> - * completion via task_work will always direct it to the right
>> - * location, rather than potentially complete requests for ringA
>> - * under iopoll invocations from ringB.
>> + * For IOPOLL, complete the request inline. The request's io_kiocb
>> + * uses a union for io_task_work and iopoll_node, so scheduling
>> + * task_work would corrupt the iopoll_list while the request is
>> + * still on it. io_uring_cmd_done() handles IOPOLL by setting
>> + * iopoll_completed rather than scheduling task_work.
>> + *
>> + * For non-IOPOLL, complete via task_work to ensure we run in the
>> + * submitter's context and handling multiple rings is safe.
>> */
>> - io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
>> + if (blk_rq_is_poll(req)) {
>> + if (pdu->bio)
>> + blk_rq_unmap_user(pdu->bio);
>> + io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
>> + } else {
>> + io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
>> + }
>> +
>> return RQ_END_IO_FREE;
>> }
>>
>
> Ah yes that should fix it, the task_work addition will conflict with
> the list addition. Don't think it's safe though, which is why I made
> them all use task_work previously. Let me fix it in the IOPOLL patch
> instead.
This should be better:
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index dd084a55bed8..1fa8d829cbac 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -719,13 +719,10 @@ struct io_kiocb {
atomic_t refs;
bool cancel_seq_set;
- /*
- * IOPOLL doesn't use task_work, so use the ->iopoll_node list
- * entry to manage pending iopoll requests.
- */
union {
struct io_task_work io_task_work;
- struct list_head iopoll_node;
+ /* For IOPOLL setup queues, with hybrid polling */
+ u64 iopoll_start;
};
union {
@@ -734,8 +731,8 @@ struct io_kiocb {
* poll
*/
struct hlist_node hash_node;
- /* For IOPOLL setup queues, with hybrid polling */
- u64 iopoll_start;
+ /* IOPOLL completion handling */
+ struct list_head iopoll_node;
/* for private io_kiocb freeing */
struct rcu_head rcu_head;
};
--
Jens Axboe
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-14 14:58 ` Jens Axboe
@ 2026-01-14 15:20 ` Ming Lei
2026-01-14 15:26 ` Jens Axboe
0 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2026-01-14 15:20 UTC (permalink / raw)
To: Jens Axboe; +Cc: Yi Zhang, fengnanchang, linux-block, Shinichiro Kawasaki
On Wed, Jan 14, 2026 at 07:58:54AM -0700, Jens Axboe wrote:
> On 1/14/26 7:43 AM, Jens Axboe wrote:
> > On 1/14/26 7:11 AM, Ming Lei wrote:
> >> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
> >>> On Thu, Jan 8, 2026 at 2:39?PM Yi Zhang <yi.zhang@redhat.com> wrote:
> >>>>
> >>>> On Thu, Jan 8, 2026 at 12:48?AM Jens Axboe <axboe@kernel.dk> wrote:
> >>>>>
> >>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
> >>>>>> Hi
> >>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
> >>>>>
> >>>>> nvme/049 presumably?
> >>>>>
> >>>> Yes.
> >>>>
> >>>>>> 100% reproduced with commit[1]. Please help check it and let me know
> >>>>>> if you need any info/test for it.
> >>>>>> Seems it's one regression, I will try to test with the latest
> >>>>>> linux-block/for-next and also bisect it tomorrow.
> >>>>>
> >>>>> Doesn't reproduce for me on the current tree, but nothing since:
> >>>>>
> >>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> >>>>>> Merge: 29cefd61e0c6 fcf463b92a08
> >>>>>> Author: Jens Axboe <axboe@kernel.dk>
> >>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
> >>>>>>
> >>>>>> Merge branch 'for-7.0/blk-pvec' into for-next
> >>>>>
> >>>>> should have impacted that. So please do bisect.
> >>>>
> >>>> Hi Jens
> >>>> The issue seems was introduced from below commit.
> >>>> and the issue cannot be reproduced after reverting this commit.
> >>>
> >>> The issue still can be reproduced on the latest linux-block/for-next
> >>
> >> Hi Yi,
> >>
> >> Can you try the following patch?
> >>
> >>
> >> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> >> index a9c097dacad6..7b0e62b8322b 100644
> >> --- a/drivers/nvme/host/ioctl.c
> >> +++ b/drivers/nvme/host/ioctl.c
> >> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> >> pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
> >>
> >> /*
> >> - * IOPOLL could potentially complete this request directly, but
> >> - * if multiple rings are polling on the same queue, then it's possible
> >> - * for one ring to find completions for another ring. Punting the
> >> - * completion via task_work will always direct it to the right
> >> - * location, rather than potentially complete requests for ringA
> >> - * under iopoll invocations from ringB.
> >> + * For IOPOLL, complete the request inline. The request's io_kiocb
> >> + * uses a union for io_task_work and iopoll_node, so scheduling
> >> + * task_work would corrupt the iopoll_list while the request is
> >> + * still on it. io_uring_cmd_done() handles IOPOLL by setting
> >> + * iopoll_completed rather than scheduling task_work.
> >> + *
> >> + * For non-IOPOLL, complete via task_work to ensure we run in the
> >> + * submitter's context and handling multiple rings is safe.
> >> */
> >> - io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> >> + if (blk_rq_is_poll(req)) {
> >> + if (pdu->bio)
> >> + blk_rq_unmap_user(pdu->bio);
> >> + io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> >> + } else {
> >> + io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> >> + }
> >> +
> >> return RQ_END_IO_FREE;
> >> }
> >>
> >
> > Ah yes that should fix it, the task_work addition will conflict with
> > the list addition. Don't think it's safe though, which is why I made
> > them all use task_work previously. Let me fix it in the IOPOLL patch
> > instead.
>
> This should be better:
>
> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index dd084a55bed8..1fa8d829cbac 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -719,13 +719,10 @@ struct io_kiocb {
> atomic_t refs;
> bool cancel_seq_set;
>
> - /*
> - * IOPOLL doesn't use task_work, so use the ->iopoll_node list
> - * entry to manage pending iopoll requests.
> - */
> union {
> struct io_task_work io_task_work;
> - struct list_head iopoll_node;
> + /* For IOPOLL setup queues, with hybrid polling */
> + u64 iopoll_start;
> };
>
> union {
> @@ -734,8 +731,8 @@ struct io_kiocb {
> * poll
> */
> struct hlist_node hash_node;
> - /* For IOPOLL setup queues, with hybrid polling */
> - u64 iopoll_start;
> + /* IOPOLL completion handling */
> + struct list_head iopoll_node;
> /* for private io_kiocb freeing */
> struct rcu_head rcu_head;
> };
This way looks better; just note that `req->iopoll_start` needs to be read
into a local variable first in io_uring_hybrid_poll().
Thanks,
Ming
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-14 15:20 ` Ming Lei
@ 2026-01-14 15:26 ` Jens Axboe
0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2026-01-14 15:26 UTC (permalink / raw)
To: Ming Lei; +Cc: Yi Zhang, fengnanchang, linux-block, Shinichiro Kawasaki
On 1/14/26 8:20 AM, Ming Lei wrote:
> On Wed, Jan 14, 2026 at 07:58:54AM -0700, Jens Axboe wrote:
>> On 1/14/26 7:43 AM, Jens Axboe wrote:
>>> On 1/14/26 7:11 AM, Ming Lei wrote:
>>>> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
>>>>> On Thu, Jan 8, 2026 at 2:39?PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>>>>>
>>>>>> On Thu, Jan 8, 2026 at 12:48?AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>
>>>>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>>>>>> Hi
>>>>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>>>>>
>>>>>>> nvme/049 presumably?
>>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>>>>>> if you need any info/test for it.
>>>>>>>> Seems it's one regression, I will try to test with the latest
>>>>>>>> linux-block/for-next and also bisect it tomorrow.
>>>>>>>
>>>>>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>>>>>
>>>>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>>>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>>>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>>>>>
>>>>>>>> Merge branch 'for-7.0/blk-pvec' into for-next
>>>>>>>
>>>>>>> should have impacted that. So please do bisect.
>>>>>>
>>>>>> Hi Jens
>>>>>> The issue seems to have been introduced by the commit below,
>>>>>> and it cannot be reproduced after reverting that commit.
>>>>>
>>>>> The issue still can be reproduced on the latest linux-block/for-next
>>>>
>>>> Hi Yi,
>>>>
>>>> Can you try the following patch?
>>>>
>>>>
>>>> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
>>>> index a9c097dacad6..7b0e62b8322b 100644
>>>> --- a/drivers/nvme/host/ioctl.c
>>>> +++ b/drivers/nvme/host/ioctl.c
>>>> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
>>>> pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
>>>>
>>>> /*
>>>> - * IOPOLL could potentially complete this request directly, but
>>>> - * if multiple rings are polling on the same queue, then it's possible
>>>> - * for one ring to find completions for another ring. Punting the
>>>> - * completion via task_work will always direct it to the right
>>>> - * location, rather than potentially complete requests for ringA
>>>> - * under iopoll invocations from ringB.
>>>> + * For IOPOLL, complete the request inline. The request's io_kiocb
>>>> + * uses a union for io_task_work and iopoll_node, so scheduling
>>>> + * task_work would corrupt the iopoll_list while the request is
>>>> + * still on it. io_uring_cmd_done() handles IOPOLL by setting
>>>> + * iopoll_completed rather than scheduling task_work.
>>>> + *
>>>> + * For non-IOPOLL, complete via task_work to ensure we run in the
>>>> + * submitter's context and handling multiple rings is safe.
>>>> */
>>>> - io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
>>>> + if (blk_rq_is_poll(req)) {
>>>> + if (pdu->bio)
>>>> + blk_rq_unmap_user(pdu->bio);
>>>> + io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
>>>> + } else {
>>>> + io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
>>>> + }
>>>> +
>>>> return RQ_END_IO_FREE;
>>>> }
>>>>
>>>
>>> Ah yes, that should fix it; the task_work addition will conflict with
>>> the list addition. I don't think it's safe though, which is why I made
>>> them all use task_work previously. Let me fix it in the IOPOLL patch
>>> instead.
>>
>> This should be better:
>>
>> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
>> index dd084a55bed8..1fa8d829cbac 100644
>> --- a/include/linux/io_uring_types.h
>> +++ b/include/linux/io_uring_types.h
>> @@ -719,13 +719,10 @@ struct io_kiocb {
>> atomic_t refs;
>> bool cancel_seq_set;
>>
>> - /*
>> - * IOPOLL doesn't use task_work, so use the ->iopoll_node list
>> - * entry to manage pending iopoll requests.
>> - */
>> union {
>> struct io_task_work io_task_work;
>> - struct list_head iopoll_node;
>> + /* For IOPOLL setup queues, with hybrid polling */
>> + u64 iopoll_start;
>> };
>>
>> union {
>> @@ -734,8 +731,8 @@ struct io_kiocb {
>> * poll
>> */
>> struct hlist_node hash_node;
>> - /* For IOPOLL setup queues, with hybrid polling */
>> - u64 iopoll_start;
>> + /* IOPOLL completion handling */
>> + struct list_head iopoll_node;
>> /* for private io_kiocb freeing */
>> struct rcu_head rcu_head;
>> };
>
> This way looks better, just `req->iopoll_start` needs to be read into a
> local variable first in io_uring_hybrid_poll().
True, let me send out a v2.
--
Jens Axboe
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-14 14:11 ` Ming Lei
2026-01-14 14:43 ` Jens Axboe
@ 2026-01-16 11:54 ` Alexander Atanasov
2026-01-16 12:41 ` Ming Lei
1 sibling, 1 reply; 13+ messages in thread
From: Alexander Atanasov @ 2026-01-16 11:54 UTC (permalink / raw)
To: Ming Lei, Yi Zhang
Cc: Jens Axboe, fengnanchang, linux-block, Shinichiro Kawasaki
Hello Ming,
On 14.01.26 16:11, Ming Lei wrote:
> On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
>> On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>>>
>>> On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> On 1/7/26 9:39 AM, Yi Zhang wrote:
>>>>> Hi
>>>>> The following issue[2] was triggered by blktests nvme/059 and it's
>>>>
>>>> nvme/049 presumably?
>>>>
>>> Yes.
>>>
>>>>> 100% reproduced with commit[1]. Please help check it and let me know
>>>>> if you need any info/test for it.
>>>>> Seems it's one regression, I will try to test with the latest
>>>>> linux-block/for-next and also bisect it tomorrow.
>>>>
>>>> Doesn't reproduce for me on the current tree, but nothing since:
>>>>
>>>>> commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
>>>>> Merge: 29cefd61e0c6 fcf463b92a08
>>>>> Author: Jens Axboe <axboe@kernel.dk>
>>>>> Date: Tue Jan 6 05:48:07 2026 -0700
>>>>>
>>>>> Merge branch 'for-7.0/blk-pvec' into for-next
>>>>
>>>> should have impacted that. So please do bisect.
>>>
>>> Hi Jens
>>> The issue seems to have been introduced by the commit below,
>>> and it cannot be reproduced after reverting that commit.
>>
>> The issue still can be reproduced on the latest linux-block/for-next
>
> Hi Yi,
>
> Can you try the following patch?
>
>
> diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> index a9c097dacad6..7b0e62b8322b 100644
> --- a/drivers/nvme/host/ioctl.c
> +++ b/drivers/nvme/host/ioctl.c
> @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
>
> /*
> - * IOPOLL could potentially complete this request directly, but
> - * if multiple rings are polling on the same queue, then it's possible
> - * for one ring to find completions for another ring. Punting the
> - * completion via task_work will always direct it to the right
> - * location, rather than potentially complete requests for ringA
> - * under iopoll invocations from ringB.
> + * For IOPOLL, complete the request inline. The request's io_kiocb
> + * uses a union for io_task_work and iopoll_node, so scheduling
> + * task_work would corrupt the iopoll_list while the request is
> + * still on it. io_uring_cmd_done() handles IOPOLL by setting
> + * iopoll_completed rather than scheduling task_work.
> + *
> + * For non-IOPOLL, complete via task_work to ensure we run in the
> + * submitter's context and handling multiple rings is safe.
> */
> - io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> + if (blk_rq_is_poll(req)) {
> + if (pdu->bio)
> + blk_rq_unmap_user(pdu->bio);
> + io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> + } else {
> + io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> + }
> +
> return RQ_END_IO_FREE;
> }
While this is a good optimisation and it will fix the list issue for a
single user, it may crash with multiple users of the context. I am
still learning this code, so excuse my ignorance here and there.
The bisected patch 3c7d76d6128a changed io_wq_work_list, which looks
safe to use without locks (it is a derivative of llist), whereas
list_head requires proper locking to be safe.
A ctx can be used to poll multiple files; iopoll_list is a list for
that reason.
sqpoll calls io_iopoll_req_issued without a lock, and that does
list_add_tail; if it races with another list addition or deletion it
will corrupt the list.
Is there any mechanism to prevent that, or am I missing something?
--
have fun,
alex
* Re: [bug report][bisected] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049
2026-01-16 11:54 ` Alexander Atanasov
@ 2026-01-16 12:41 ` Ming Lei
0 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2026-01-16 12:41 UTC (permalink / raw)
To: alex+zkern
Cc: Yi Zhang, Jens Axboe, fengnanchang, linux-block,
Shinichiro Kawasaki
On Fri, Jan 16, 2026 at 01:54:15PM +0200, Alexander Atanasov wrote:
> Hello Ming,
>
> On 14.01.26 16:11, Ming Lei wrote:
> > On Wed, Jan 14, 2026 at 01:58:03PM +0800, Yi Zhang wrote:
> > > On Thu, Jan 8, 2026 at 2:39 PM Yi Zhang <yi.zhang@redhat.com> wrote:
> > > >
> > > > On Thu, Jan 8, 2026 at 12:48 AM Jens Axboe <axboe@kernel.dk> wrote:
> > > > >
> > > > > On 1/7/26 9:39 AM, Yi Zhang wrote:
> > > > > > Hi
> > > > > > The following issue[2] was triggered by blktests nvme/059 and it's
> > > > >
> > > > > nvme/049 presumably?
> > > > >
> > > > Yes.
> > > >
> > > > > > 100% reproduced with commit[1]. Please help check it and let me know
> > > > > > if you need any info/test for it.
> > > > > > Seems it's one regression, I will try to test with the latest
> > > > > > linux-block/for-next and also bisect it tomorrow.
> > > > >
> > > > > Doesn't reproduce for me on the current tree, but nothing since:
> > > > >
> > > > > > commit 5ee81d4ae52ec4e9206efb4c1b06e269407aba11
> > > > > > Merge: 29cefd61e0c6 fcf463b92a08
> > > > > > Author: Jens Axboe <axboe@kernel.dk>
> > > > > > Date: Tue Jan 6 05:48:07 2026 -0700
> > > > > >
> > > > > > Merge branch 'for-7.0/blk-pvec' into for-next
> > > > >
> > > > > should have impacted that. So please do bisect.
> > > >
> > > > Hi Jens
> > > > The issue seems to have been introduced by the commit below,
> > > > and it cannot be reproduced after reverting that commit.
> > >
> > > The issue still can be reproduced on the latest linux-block/for-next
> >
> > Hi Yi,
> >
> > Can you try the following patch?
> >
> >
> > diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
> > index a9c097dacad6..7b0e62b8322b 100644
> > --- a/drivers/nvme/host/ioctl.c
> > +++ b/drivers/nvme/host/ioctl.c
> > @@ -425,14 +425,23 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req,
> > pdu->result = le64_to_cpu(nvme_req(req)->result.u64);
> > /*
> > - * IOPOLL could potentially complete this request directly, but
> > - * if multiple rings are polling on the same queue, then it's possible
> > - * for one ring to find completions for another ring. Punting the
> > - * completion via task_work will always direct it to the right
> > - * location, rather than potentially complete requests for ringA
> > - * under iopoll invocations from ringB.
> > + * For IOPOLL, complete the request inline. The request's io_kiocb
> > + * uses a union for io_task_work and iopoll_node, so scheduling
> > + * task_work would corrupt the iopoll_list while the request is
> > + * still on it. io_uring_cmd_done() handles IOPOLL by setting
> > + * iopoll_completed rather than scheduling task_work.
> > + *
> > + * For non-IOPOLL, complete via task_work to ensure we run in the
> > + * submitter's context and handling multiple rings is safe.
> > */
> > - io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> > + if (blk_rq_is_poll(req)) {
> > + if (pdu->bio)
> > + blk_rq_unmap_user(pdu->bio);
> > + io_uring_cmd_done32(ioucmd, pdu->status, pdu->result, 0);
> > + } else {
> > + io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb);
> > + }
> > +
> > return RQ_END_IO_FREE;
> > }
>
>
> While this is a good optimisation and it will fix the list issue for a
> single user, it may crash with multiple users of the context. I am still
> learning this code, so excuse my ignorance here and there.
Jens has sent the following fix already:
https://lore.kernel.org/io-uring/aWhGEMsaOf752f5z@fedora/T/#t
>
> The bisected patch 3c7d76d6128a changed io_wq_work_list, which looks
> safe to use without locks (it is a derivative of llist), whereas
> list_head requires proper locking to be safe.
>
> A ctx can be used to poll multiple files; iopoll_list is a list for that
> reason.
> sqpoll calls io_iopoll_req_issued without a lock, and that does
> list_add_tail; if it races with another list addition or deletion it
> will corrupt the list.
>
> Is there any mechanism to prevent that, or am I missing something?
io_iopoll_req_issued() will grab ctx->uring_lock if it isn't already held.
Thanks,
Ming
Thread overview: 13+ messages
2026-01-07 16:39 [bug report] kernel BUG at lib/list_debug.c:32! triggered by blktests nvme/049 Yi Zhang
2026-01-07 16:48 ` Jens Axboe
2026-01-08 6:39 ` Yi Zhang
2026-01-14 5:58 ` [bug report][bisected] " Yi Zhang
2026-01-14 9:40 ` Alexander Atanasov
2026-01-14 12:43 ` Christoph Hellwig
2026-01-14 14:11 ` Ming Lei
2026-01-14 14:43 ` Jens Axboe
2026-01-14 14:58 ` Jens Axboe
2026-01-14 15:20 ` Ming Lei
2026-01-14 15:26 ` Jens Axboe
2026-01-16 11:54 ` Alexander Atanasov
2026-01-16 12:41 ` Ming Lei