* [syzbot] [smc?] KCSAN: data-race in smc_switch_to_fallback / sock_poll (2)
From: syzbot @ 2026-02-06 6:16 UTC
To: alibuda, davem, dust.li, edumazet, guwen, horms, kuba,
linux-kernel, linux-rdma, linux-s390, mjambigi, netdev, pabeni,
sidraya, syzkaller-bugs, tonylu, wenjia
Hello,
syzbot found the following issue on:
HEAD commit: 5fd0a1df5d05 Merge tag 'v6.19rc8-smb3-client-fixes' of git..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1070aa5a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=8e27f4588a0f2183
dashboard link: https://syzkaller.appspot.com/bug?extid=198c20fde37cb9f6b0ac
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/a09cd69509c3/disk-5fd0a1df.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/f218ec1eb157/vmlinux-5fd0a1df.xz
kernel image: https://storage.googleapis.com/syzbot-assets/8549229eee91/bzImage-5fd0a1df.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+198c20fde37cb9f6b0ac@syzkaller.appspotmail.com
==================================================================
BUG: KCSAN: data-race in smc_switch_to_fallback / sock_poll
write to 0xffff888127398c18 of 8 bytes by task 14369 on cpu 1:
smc_switch_to_fallback+0x4ea/0x7e0 net/smc/af_smc.c:933
smc_sendmsg+0xce/0x2f0 net/smc/af_smc.c:2797
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
____sys_sendmsg+0x5af/0x600 net/socket.c:2592
___sys_sendmsg+0x195/0x1e0 net/socket.c:2646
__sys_sendmsg net/socket.c:2678 [inline]
__do_sys_sendmsg net/socket.c:2683 [inline]
__se_sys_sendmsg net/socket.c:2681 [inline]
__x64_sys_sendmsg+0xd4/0x160 net/socket.c:2681
x64_sys_call+0x17ba/0x3000 arch/x86/include/generated/asm/syscalls_64.h:47
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xc0/0x2a0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
read to 0xffff888127398c18 of 8 bytes by task 14367 on cpu 0:
sock_poll+0x27/0x240 net/socket.c:1427
vfs_poll include/linux/poll.h:82 [inline]
__io_arm_poll_handler+0x1ee/0xb80 io_uring/poll.c:581
io_poll_add+0x69/0xf0 io_uring/poll.c:899
__io_issue_sqe+0xfd/0x2d0 io_uring/io_uring.c:1793
io_issue_sqe+0x20b/0xc20 io_uring/io_uring.c:1816
io_queue_sqe io_uring/io_uring.c:2043 [inline]
io_submit_sqe io_uring/io_uring.c:2321 [inline]
io_submit_sqes+0x78a/0x11b0 io_uring/io_uring.c:2435
__do_sys_io_uring_enter io_uring/io_uring.c:3285 [inline]
__se_sys_io_uring_enter+0x1bf/0x1c70 io_uring/io_uring.c:3224
__x64_sys_io_uring_enter+0x78/0x90 io_uring/io_uring.c:3224
x64_sys_call+0x27e4/0x3000 arch/x86/include/generated/asm/syscalls_64.h:427
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xc0/0x2a0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
value changed: 0xffff88811a6056c0 -> 0xffff88811a606080
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 14367 Comm: syz.8.3658 Not tainted syzkaller #0 PREEMPT(voluntary)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [smc?] KCSAN: data-race in smc_switch_to_fallback / sock_poll (2)
From: Dmitry Vyukov @ 2026-02-06 6:30 UTC
To: syzbot
Cc: alibuda, davem, dust.li, edumazet, guwen, horms, kuba,
linux-kernel, linux-rdma, linux-s390, mjambigi, netdev, pabeni,
sidraya, syzkaller-bugs, tonylu, wenjia
On Fri, 6 Feb 2026 at 07:16, syzbot
<syzbot+198c20fde37cb9f6b0ac@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 5fd0a1df5d05 Merge tag 'v6.19rc8-smb3-client-fixes' of git..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1070aa5a580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=8e27f4588a0f2183
> dashboard link: https://syzkaller.appspot.com/bug?extid=198c20fde37cb9f6b0ac
> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/a09cd69509c3/disk-5fd0a1df.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/f218ec1eb157/vmlinux-5fd0a1df.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/8549229eee91/bzImage-5fd0a1df.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+198c20fde37cb9f6b0ac@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KCSAN: data-race in smc_switch_to_fallback / sock_poll
>
> write to 0xffff888127398c18 of 8 bytes by task 14369 on cpu 1:
> smc_switch_to_fallback+0x4ea/0x7e0 net/smc/af_smc.c:933
> smc_sendmsg+0xce/0x2f0 net/smc/af_smc.c:2797
> sock_sendmsg_nosec net/socket.c:727 [inline]
> __sock_sendmsg net/socket.c:742 [inline]
> ____sys_sendmsg+0x5af/0x600 net/socket.c:2592
> ___sys_sendmsg+0x195/0x1e0 net/socket.c:2646
> __sys_sendmsg net/socket.c:2678 [inline]
> __do_sys_sendmsg net/socket.c:2683 [inline]
> __se_sys_sendmsg net/socket.c:2681 [inline]
> __x64_sys_sendmsg+0xd4/0x160 net/socket.c:2681
> x64_sys_call+0x17ba/0x3000 arch/x86/include/generated/asm/syscalls_64.h:47
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0xc0/0x2a0 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> read to 0xffff888127398c18 of 8 bytes by task 14367 on cpu 0:
> sock_poll+0x27/0x240 net/socket.c:1427
> vfs_poll include/linux/poll.h:82 [inline]
> __io_arm_poll_handler+0x1ee/0xb80 io_uring/poll.c:581
> io_poll_add+0x69/0xf0 io_uring/poll.c:899
> __io_issue_sqe+0xfd/0x2d0 io_uring/io_uring.c:1793
> io_issue_sqe+0x20b/0xc20 io_uring/io_uring.c:1816
> io_queue_sqe io_uring/io_uring.c:2043 [inline]
> io_submit_sqe io_uring/io_uring.c:2321 [inline]
> io_submit_sqes+0x78a/0x11b0 io_uring/io_uring.c:2435
> __do_sys_io_uring_enter io_uring/io_uring.c:3285 [inline]
> __se_sys_io_uring_enter+0x1bf/0x1c70 io_uring/io_uring.c:3224
> __x64_sys_io_uring_enter+0x78/0x90 io_uring/io_uring.c:3224
> x64_sys_call+0x27e4/0x3000 arch/x86/include/generated/asm/syscalls_64.h:427
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0xc0/0x2a0 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> value changed: 0xffff88811a6056c0 -> 0xffff88811a606080
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 UID: 0 PID: 14367 Comm: syz.8.3658 Not tainted syzkaller #0 PREEMPT(voluntary)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
> ==================================================================
Here is what an LLM said about the harmfulness of this data race.
It does not look totally bogus to me. At the very least, the read of
file->private_data in sock_poll() needs to be done with READ_ONCE to
avoid the harmful scenario. I don't know whether changing a fundamental
socket function because of this is the best solution, though.
========
The data race occurs on the `file->private_data` field of a socket
file descriptor. This field is being updated in
`smc_switch_to_fallback()` (to point to the underlying TCP/CLC socket
instead of the SMC socket) while concurrently being read in
`sock_poll()`.
### Analysis of the Race
1. **Nature of the Access**: `smc_switch_to_fallback()` is performing
a plain write to `file->private_data`, and `sock_poll()` is performing
a plain read. There is no mutual exclusion (like a lock) or memory
barrier protecting this transition.
2. **Type Confusion Risk**: In `sock_poll()`, the code first reads
`file->private_data` into a local variable `sock`, then reads
`sock->ops`, and finally calls `ops->poll(file, sock, wait)`. If the
compiler reloads `sock` from `file->private_data` between these steps
(which is permitted under the C memory model for non-volatile
accesses), it could fetch the `ops` from the SMC socket but then call
that `ops->poll` function (i.e., `smc_poll`) passing the TCP socket as
the `sock` argument.
3. **Consequences of Type Confusion**: `smc_poll()` converts `sock->sk`
to a `struct smc_sock *`. The sock of a TCP socket (`struct tcp_sock`)
is not compatible with `struct smc_sock`. Accessing SMC-specific fields
(like `smc->use_fallback` or `smc->conn`) on a TCP socket object would
read unrelated memory, leading to undefined behavior, logic errors, or
a kernel crash.
4. **Inconsistent State**: Even if the compiler does not reload the
pointer, the race between setting `smc->use_fallback = true` and
updating `file->private_data` means that `sock_poll()` might see an
inconsistent state where it enters `smc_poll()` but the fallback is
already partially complete, potentially accessing uninitialized or
transitioning connection state.
5. **Violation of Invariants**: In the Linux kernel,
`file->private_data` for a socket is generally expected to be constant
for the lifetime of the `file` object. SMC's "fallback" mechanism
violates this invariant. While the mechanism is intended to be a
performance optimization, doing so without proper synchronization
(like `READ_ONCE`/`WRITE_ONCE` or a lock) makes it unsafe.
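The reload hazard from point 2 can be illustrated with a small userspace
model. This is only a sketch under stated assumptions: every `model_*` type
and both `poll_with_*` functions are hypothetical stand-ins, not kernel code,
and the concurrent writer is simulated by an in-line assignment so that the
interleaving is deterministic.

```c
#include <stddef.h>

/* Userspace model only: simplified stand-ins, not the kernel's structures. */
struct model_socket;

struct model_ops {
	/* Returns 1 if the handler got the socket type it expects, else 0. */
	int (*poll)(struct model_socket *sock);
};

struct model_socket {
	const struct model_ops *ops;
	int is_smc;	/* 1 for the SMC socket, 0 for the fallback TCP socket */
};

struct model_file {
	struct model_socket *private_data;
};

static int model_smc_poll(struct model_socket *sock) { return sock->is_smc; }
static int model_tcp_poll(struct model_socket *sock) { return !sock->is_smc; }

static const struct model_ops model_smc_ops = { .poll = model_smc_poll };
static const struct model_ops model_tcp_ops = { .poll = model_tcp_poll };

/*
 * Models a compiler that loads file->private_data twice. If `fallback` is
 * non-NULL, the pointer is switched between the two loads, standing in for
 * the concurrent writer.
 */
static int poll_with_reload(struct model_file *file,
			    struct model_socket *fallback)
{
	const struct model_ops *ops = file->private_data->ops;	/* load 1 */

	if (fallback)
		file->private_data = fallback;	/* simulated racing write */

	return ops->poll(file->private_data);	/* load 2: other socket */
}

/* Loads the pointer once, so ops and sock always belong together. */
static int poll_with_snapshot(struct model_file *file,
			      struct model_socket *fallback)
{
	struct model_socket *sock = file->private_data;	/* single load */
	const struct model_ops *ops = sock->ops;

	if (fallback)
		file->private_data = fallback;	/* simulated racing write */

	return ops->poll(sock);
}
```

In this model, `poll_with_reload()` hands the TCP stand-in to the SMC
handler when the switch lands between the two loads, while
`poll_with_snapshot()` keeps the pair consistent. The model says nothing
about whether a given compiler actually emits the second load.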
### Conclusion
This data race is **harmful** because it can lead to type confusion
and memory corruption. It is not a simple statistics counter or a
benign flag race; it involves the fundamental identity of the socket
object being operated on.
The fix for this would typically involve using `READ_ONCE` and
`WRITE_ONCE` to prevent compiler reloads and ensure atomicity, or
better yet, avoiding the mid-flight change of `file->private_data`
altogether.
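
A minimal userspace sketch of the `READ_ONCE`/`WRITE_ONCE` pattern follows.
The `MODEL_*` macros are simplified stand-ins built on volatile accesses and
the GCC/Clang `__typeof__` extension, not the kernel's definitions, and the
`*_model` names are hypothetical. The sketch only shows the single-load,
single-store discipline; it does not address the ordering concern from
point 4.

```c
/* Simplified stand-ins for the kernel macros; an assumption of this sketch. */
#define MODEL_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define MODEL_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

struct sk_model {
	int id;		/* identifies which socket object this is */
};

struct file_model {
	struct sk_model *private_data;
};

/* Writer side: publish the fallback socket with exactly one store. */
static void switch_to_fallback_model(struct file_model *file,
				     struct sk_model *fallback)
{
	MODEL_WRITE_ONCE(file->private_data, fallback);
}

/*
 * Reader side: take one snapshot of the pointer and use only the local
 * copy afterwards, so later uses cannot observe a different socket.
 */
static int poll_model(struct file_model *file)
{
	struct sk_model *sock = MODEL_READ_ONCE(file->private_data);

	return sock->id;
}
```

A reader that runs before the switch sees the old socket and one that runs
after it sees the new one; in neither case is the pointer fetched twice
within a single call.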