* [syzbot] [rdma?] WARNING in ib_dealloc_device
@ 2026-04-13 0:04 syzbot
2026-04-13 15:43 ` Leon Romanovsky
0 siblings, 1 reply; 3+ messages in thread
From: syzbot @ 2026-04-13 0:04 UTC (permalink / raw)
To: jgg, leon, linux-kernel, linux-rdma, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 7f87a5ea75f0 Merge tag 'hid-for-linus-2026040801' of git:/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11778eba580000
kernel config: https://syzkaller.appspot.com/x/.config?x=45cb3c58fd963c27
dashboard link: https://syzkaller.appspot.com/bug?extid=03393ff6c35fd2cc43de
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0f5deca1373e/disk-7f87a5ea.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/6aea6c1c6b6e/vmlinux-7f87a5ea.xz
kernel image: https://storage.googleapis.com/syzbot-assets/61444b289e96/bzImage-7f87a5ea.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+03393ff6c35fd2cc43de@syzkaller.appspotmail.com
------------[ cut here ]------------
!xa_empty(&device->compat_devs)
WARNING: drivers/infiniband/core/device.c:682 at ib_dealloc_device+0x187/0x200 drivers/infiniband/core/device.c:682, CPU#0: kworker/u8:37/4856
Modules linked in:
CPU: 0 UID: 0 PID: 4856 Comm: kworker/u8:37 Tainted: G L syzkaller #0 PREEMPT_{RT,(full)}
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/18/2026
Workqueue: ib-unreg-wq ib_unregister_work
RIP: 0010:ib_dealloc_device+0x187/0x200 drivers/infiniband/core/device.c:682
Code: e8 de ec ad f9 48 89 df e8 56 59 07 00 48 81 c3 30 08 00 00 48 89 df 5b 41 5c 41 5e 41 5f 5d e9 0f 09 60 fd e8 ba ec ad f9 90 <0f> 0b 90 e9 72 ff ff ff e8 ac ec ad f9 90 0f 0b 90 eb 8f e8 a1 ec
RSP: 0018:ffffc9000f49fa18 EFLAGS: 00010293
RAX: ffffffff88169536 RBX: ffff888039d40000 RCX: ffff88806a691e80
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888039d41308 R08: 0000000000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: fffffbfff1ed4eb7 R12: 1ffff110073a81fd
R13: dffffc0000000000 R14: ffff888039d41268 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff888126332000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff6d2897e9c CR3: 0000000022382000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000f1ffffdf
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__ib_unregister_device+0x393/0x3f0 drivers/infiniband/core/device.c:1545
ib_unregister_work+0x19/0x30 drivers/infiniband/core/device.c:1639
process_one_work kernel/workqueue.c:3276 [inline]
process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [rdma?] WARNING in ib_dealloc_device
2026-04-13 0:04 [syzbot] [rdma?] WARNING in ib_dealloc_device syzbot
@ 2026-04-13 15:43 ` Leon Romanovsky
[not found] ` <PH7PR12MB66356E0176748BFFF081D9B4B0242@PH7PR12MB6635.namprd12.prod.outlook.com>
0 siblings, 1 reply; 3+ messages in thread
From: Leon Romanovsky @ 2026-04-13 15:43 UTC (permalink / raw)
To: syzbot; +Cc: jgg, linux-kernel, linux-rdma, syzkaller-bugs, Jiri Pirko
On Sun, Apr 12, 2026 at 05:04:32PM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 7f87a5ea75f0 Merge tag 'hid-for-linus-2026040801' of git:/..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11778eba580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=45cb3c58fd963c27
> dashboard link: https://syzkaller.appspot.com/bug?extid=03393ff6c35fd2cc43de
> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/0f5deca1373e/disk-7f87a5ea.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/6aea6c1c6b6e/vmlinux-7f87a5ea.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/61444b289e96/bzImage-7f87a5ea.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+03393ff6c35fd2cc43de@syzkaller.appspotmail.com
>
> ------------[ cut here ]------------
> !xa_empty(&device->compat_devs)
> WARNING: drivers/infiniband/core/device.c:682 at ib_dealloc_device+0x187/0x200 drivers/infiniband/core/device.c:682, CPU#0: kworker/u8:37/4856
I think that we have only one patch in this area https://patch.msgid.link/20260127093839.126291-1-jiri@resnulli.us
Thanks
> Modules linked in:
> CPU: 0 UID: 0 PID: 4856 Comm: kworker/u8:37 Tainted: G L syzkaller #0 PREEMPT_{RT,(full)}
> Tainted: [L]=SOFTLOCKUP
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/18/2026
> Workqueue: ib-unreg-wq ib_unregister_work
> RIP: 0010:ib_dealloc_device+0x187/0x200 drivers/infiniband/core/device.c:682
> Code: e8 de ec ad f9 48 89 df e8 56 59 07 00 48 81 c3 30 08 00 00 48 89 df 5b 41 5c 41 5e 41 5f 5d e9 0f 09 60 fd e8 ba ec ad f9 90 <0f> 0b 90 e9 72 ff ff ff e8 ac ec ad f9 90 0f 0b 90 eb 8f e8 a1 ec
> RSP: 0018:ffffc9000f49fa18 EFLAGS: 00010293
> RAX: ffffffff88169536 RBX: ffff888039d40000 RCX: ffff88806a691e80
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff888039d41308 R08: 0000000000000000 R09: 0000000000000000
> R10: dffffc0000000000 R11: fffffbfff1ed4eb7 R12: 1ffff110073a81fd
> R13: dffffc0000000000 R14: ffff888039d41268 R15: dffffc0000000000
> FS: 0000000000000000(0000) GS:ffff888126332000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ff6d2897e9c CR3: 0000000022382000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000f1ffffdf
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> __ib_unregister_device+0x393/0x3f0 drivers/infiniband/core/device.c:1545
> ib_unregister_work+0x19/0x30 drivers/infiniband/core/device.c:1639
> process_one_work kernel/workqueue.c:3276 [inline]
> process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
> worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
> kthread+0x388/0x470 kernel/kthread.c:436
> ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup
* Re: [syzbot] [rdma?] WARNING in ib_dealloc_device
[not found] ` <PH7PR12MB66356E0176748BFFF081D9B4B0242@PH7PR12MB6635.namprd12.prod.outlook.com>
@ 2026-04-13 17:42 ` Jason Gunthorpe
0 siblings, 0 replies; 3+ messages in thread
From: Jason Gunthorpe @ 2026-04-13 17:42 UTC (permalink / raw)
To: Jiri Pirko
Cc: Leon Romanovsky, syzbot, linux-kernel@vger.kernel.org,
linux-rdma@vger.kernel.org, syzkaller-bugs@googlegroups.com
On Mon, Apr 13, 2026 at 04:12:09PM +0000, Jiri Pirko wrote:
> Will check it tmrw
I fed it to Claude and after 40 mins it is stumped too.. It should not
be possible for this to happen.
__ib_unregister_device() always calls down to disable_device(), which
always removes it from all visibility, drives the refcount to 0
and then cleans the xarray:
xa_for_each (&device->compat_devs, index, cdev)
remove_one_compat_dev(device, index);
Then ib_dealloc_device() checks it is empty:
WARN_ON(!xa_empty(&device->compat_devs));
At the point the xa_for_each is run there should be no concurrent
threads that can see the device. The refcount is zero, it was removed
from the xarray. The add_one_compat_dev() is never called in a
condition that could see a stray device.
It should not be possible for the compat_devs of a 0 refcount
ib_device removed from the device's xarray to be mutated between those
two checks.
One notable thing about xarray is you can have a xa_for_each() iterate
over nothing and also have xa_empty() be false. Maybe that is
happening here, but I could not find any way that should happen.
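To make that xarray property concrete, here is a minimal userspace sketch. It is a toy model, not the kernel xarray: struct toy_xa, TOY_ZERO_ENTRY and every toy_* helper are hypothetical names invented for illustration. The only behaviour it borrows from the real API is that a reserved slot (the "zero entry" left by xa_reserve()) is skipped by iteration yet still counts as occupied for an emptiness check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical sentinel standing in for the xarray's internal zero entry. */
static char toy_zero_marker;
#define TOY_ZERO_ENTRY ((void *)&toy_zero_marker)

/* Toy stand-in for struct xarray: a flat array of slots. */
struct toy_xa {
	void *slot[TOY_SLOTS];
};

/* Models xa_reserve(): occupy a free slot without a visible pointer. */
static void toy_reserve(struct toy_xa *xa, size_t index)
{
	assert(index < TOY_SLOTS);
	if (!xa->slot[index])
		xa->slot[index] = TOY_ZERO_ENTRY;
}

/* Models xa_store() of a normal pointer. */
static void toy_store(struct toy_xa *xa, size_t index, void *entry)
{
	assert(index < TOY_SLOTS);
	xa->slot[index] = entry;
}

/* Models xa_empty(): any occupied slot, reserved or not, means "not empty". */
static bool toy_empty(const struct toy_xa *xa)
{
	for (size_t i = 0; i < TOY_SLOTS; i++)
		if (xa->slot[i])
			return false;
	return true;
}

/* Models what xa_for_each() visits: reserved slots are skipped. */
static size_t toy_count_visible(const struct toy_xa *xa)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_SLOTS; i++)
		if (xa->slot[i] && xa->slot[i] != TOY_ZERO_ENTRY)
			n++;
	return n;
}

/*
 * Models the "xa_for_each + remove" cleanup loop: erase every visible
 * entry and return how many were erased. Reserved slots are never seen
 * by the loop, so they are left behind.
 */
static size_t toy_remove_all_visible(struct toy_xa *xa)
{
	size_t removed = 0;

	for (size_t i = 0; i < TOY_SLOTS; i++) {
		if (xa->slot[i] && xa->slot[i] != TOY_ZERO_ENTRY) {
			xa->slot[i] = NULL;
			removed++;
		}
	}
	return removed;
}
```

In this model, storing one pointer, reserving a different slot and then running toy_remove_all_visible() leaves toy_count_visible() at 0 while toy_empty() is still false. That is the shape of state the WARN_ON would trip over, if a reservation were somehow left behind; the sketch says nothing about how that could happen in device.c.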
I guess just keep watching this and see if it happens ever again. Add
some debugging to print out the xarray. Maybe the way we are using
xarray is unexpectedly triggering a stray 0 entry?
Jason
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 4c174f7f1070cb..592e29b0cccf39 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -1020,6 +1020,71 @@ static void remove_compat_devs(struct ib_device *device)
 	xa_for_each (&device->compat_devs, index, cdev)
 		remove_one_compat_dev(device, index);
+
+	if (!xa_empty(&device->compat_devs)) {
+		struct xa_node *node;
+		void *head;
+		unsigned int i;
+
+		dev_warn(&device->dev,
+			 "compat_devs xarray not empty after removal!\n");
+
+		xa_lock(&device->compat_devs);
+		head = xa_head_locked(&device->compat_devs);
+		dev_warn(&device->dev, "  xa_head=%px xa_flags=%x\n",
+			 head, device->compat_devs.xa_flags);
+
+		if (!xa_is_node(head)) {
+			/* Single entry at index 0 stored directly in head */
+			if (xa_is_zero(head))
+				dev_warn(&device->dev,
+					 "  head[0]: zero entry (leaked xa_reserve)\n");
+			else if (!xa_is_internal(head))
+				dev_warn(&device->dev,
+					 "  head[0]: pointer %px\n", head);
+			else
+				dev_warn(&device->dev,
+					 "  head[0]: internal %px (%lu)\n",
+					 head, xa_to_internal(head));
+		} else {
+			node = xa_to_node(head);
+			dev_warn(&device->dev,
+				 "  node %px shift %d count %d nr_values %d\n",
+				 node, node->shift, node->count,
+				 node->nr_values);
+			for (i = 0; i < XA_CHUNK_SIZE; i++) {
+				void *entry = xa_entry_locked(
+					&device->compat_devs, node, i);
+
+				if (!entry)
+					continue;
+				if (xa_is_zero(entry))
+					dev_warn(&device->dev,
+						 "  slot[%u]: zero entry (leaked xa_reserve)\n",
+						 i);
+				else if (xa_is_sibling(entry))
+					dev_warn(&device->dev,
+						 "  slot[%u]: sibling -> slot %lu\n",
+						 i, xa_to_sibling(entry));
+				else if (xa_is_retry(entry))
+					dev_warn(&device->dev,
+						 "  slot[%u]: retry\n", i);
+				else if (xa_is_node(entry))
+					dev_warn(&device->dev,
+						 "  slot[%u]: node %px (deeper tree)\n",
+						 i, xa_to_node(entry));
+				else if (!xa_is_internal(entry))
+					dev_warn(&device->dev,
+						 "  slot[%u]: pointer %px\n",
+						 i, entry);
+				else
+					dev_warn(&device->dev,
+						 "  slot[%u]: unknown internal %px\n",
+						 i, entry);
+			}
+		}
+		xa_unlock(&device->compat_devs);
+	}
 }
 static int add_compat_devs(struct ib_device *device)