public inbox for netdev@vger.kernel.org
* [syzbot] [smc?] KCSAN: data-race in smc_switch_to_fallback / sock_poll (2)
From: syzbot @ 2026-02-06  6:16 UTC
  To: alibuda, davem, dust.li, edumazet, guwen, horms, kuba,
	linux-kernel, linux-rdma, linux-s390, mjambigi, netdev, pabeni,
	sidraya, syzkaller-bugs, tonylu, wenjia

Hello,

syzbot found the following issue on:

HEAD commit:    5fd0a1df5d05 Merge tag 'v6.19rc8-smb3-client-fixes' of git..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1070aa5a580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8e27f4588a0f2183
dashboard link: https://syzkaller.appspot.com/bug?extid=198c20fde37cb9f6b0ac
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/a09cd69509c3/disk-5fd0a1df.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/f218ec1eb157/vmlinux-5fd0a1df.xz
kernel image: https://storage.googleapis.com/syzbot-assets/8549229eee91/bzImage-5fd0a1df.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+198c20fde37cb9f6b0ac@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in smc_switch_to_fallback / sock_poll

write to 0xffff888127398c18 of 8 bytes by task 14369 on cpu 1:
 smc_switch_to_fallback+0x4ea/0x7e0 net/smc/af_smc.c:933
 smc_sendmsg+0xce/0x2f0 net/smc/af_smc.c:2797
 sock_sendmsg_nosec net/socket.c:727 [inline]
 __sock_sendmsg net/socket.c:742 [inline]
 ____sys_sendmsg+0x5af/0x600 net/socket.c:2592
 ___sys_sendmsg+0x195/0x1e0 net/socket.c:2646
 __sys_sendmsg net/socket.c:2678 [inline]
 __do_sys_sendmsg net/socket.c:2683 [inline]
 __se_sys_sendmsg net/socket.c:2681 [inline]
 __x64_sys_sendmsg+0xd4/0x160 net/socket.c:2681
 x64_sys_call+0x17ba/0x3000 arch/x86/include/generated/asm/syscalls_64.h:47
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xc0/0x2a0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff888127398c18 of 8 bytes by task 14367 on cpu 0:
 sock_poll+0x27/0x240 net/socket.c:1427
 vfs_poll include/linux/poll.h:82 [inline]
 __io_arm_poll_handler+0x1ee/0xb80 io_uring/poll.c:581
 io_poll_add+0x69/0xf0 io_uring/poll.c:899
 __io_issue_sqe+0xfd/0x2d0 io_uring/io_uring.c:1793
 io_issue_sqe+0x20b/0xc20 io_uring/io_uring.c:1816
 io_queue_sqe io_uring/io_uring.c:2043 [inline]
 io_submit_sqe io_uring/io_uring.c:2321 [inline]
 io_submit_sqes+0x78a/0x11b0 io_uring/io_uring.c:2435
 __do_sys_io_uring_enter io_uring/io_uring.c:3285 [inline]
 __se_sys_io_uring_enter+0x1bf/0x1c70 io_uring/io_uring.c:3224
 __x64_sys_io_uring_enter+0x78/0x90 io_uring/io_uring.c:3224
 x64_sys_call+0x27e4/0x3000 arch/x86/include/generated/asm/syscalls_64.h:427
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xc0/0x2a0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0xffff88811a6056c0 -> 0xffff88811a606080

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 14367 Comm: syz.8.3658 Not tainted syzkaller #0 PREEMPT(voluntary) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [smc?] KCSAN: data-race in smc_switch_to_fallback / sock_poll (2)
From: Dmitry Vyukov @ 2026-02-06  6:30 UTC
  To: syzbot
  Cc: alibuda, davem, dust.li, edumazet, guwen, horms, kuba,
	linux-kernel, linux-rdma, linux-s390, mjambigi, netdev, pabeni,
	sidraya, syzkaller-bugs, tonylu, wenjia


Here is what an LLM said about the harmfulness of this data race.
It does not look totally bogus to me. At the very least, the read of
file->private_data in sock_poll() needs to be done with READ_ONCE to
avoid the harmful scenario. I don't know whether changing a fundamental
socket function because of this is the best solution, though.

========

The data race occurs on the `file->private_data` field of a socket
file descriptor. This field is being updated in
`smc_switch_to_fallback()` (to point to the underlying TCP/CLC socket
instead of the SMC socket) while concurrently being read in
`sock_poll()`.

### Analysis of the Race

1.  **Nature of the Access**: `smc_switch_to_fallback()` is performing
a plain write to `file->private_data`, and `sock_poll()` is performing
a plain read. There is no mutual exclusion (like a lock) or memory
barrier protecting this transition.
2.  **Type Confusion Risk**: In `sock_poll()`, the code first reads
`file->private_data` into a local variable `sock`, then reads
`sock->ops`, and finally calls `ops->poll(file, sock, wait)`. If the
compiler reloads `sock` from `file->private_data` between these steps
(which is permitted under the C memory model for non-volatile
accesses), it could fetch the `ops` from the SMC socket but then call
that `ops->poll` function (i.e., `smc_poll`) passing the TCP socket as
the `sock` argument.
3.  **Consequences of Type Confusion**: `smc_poll()` obtains its
`struct smc_sock *` from `sock->sk` (via `smc_sk()`). The `sk` of a TCP
socket is a `struct tcp_sock`, which is not layout-compatible with
`struct smc_sock`. Accessing SMC-specific fields (like
`smc->use_fallback` or `smc->conn`) on a TCP socket object would
read unrelated memory, leading to undefined behavior, logic errors,
or a kernel crash.
4.  **Inconsistent State**: Even if the compiler does not reload the
pointer, the race between setting `smc->use_fallback = true` and
updating `file->private_data` means that `sock_poll()` might see an
inconsistent state where it enters `smc_poll()` but the fallback is
already partially complete, potentially accessing uninitialized or
transitioning connection state.
5.  **Violation of Invariants**: In the Linux kernel,
`file->private_data` for a socket is generally expected to be constant
for the lifetime of the `file` object. SMC's "fallback" mechanism
violates this invariant. While the mechanism is intended to be a
performance optimization, doing so without proper synchronization
(like `READ_ONCE`/`WRITE_ONCE` or a lock) makes it unsafe.
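
The reload hazard from point 2 can be modeled in plain userspace C. The
sketch below is not kernel code: every type and function name in it
(`sock_model`, `ops_model`, `file_model`, `poll_with_reload`, and so on)
is hypothetical, and the "concurrent" write is simulated by calling the
writer between the two loads. It only illustrates how `ops` taken from
one socket can end up paired with a different socket.

```c
#include <assert.h>
#include <stddef.h>

struct sock_model;

/* Hypothetical stand-in for struct proto_ops: only the poll hook. */
struct ops_model {
    int (*poll)(struct sock_model *sock);
};

/* Hypothetical stand-in for struct socket. */
struct sock_model {
    const struct ops_model *ops;
    int is_smc; /* 1 for the SMC socket, 0 for the TCP/CLC socket */
};

/* Hypothetical stand-in for struct file. */
struct file_model {
    struct sock_model *private_data;
};

/* Each hook returns 1 if it was handed the socket type it expects. */
static int smc_poll_model(struct sock_model *sock) { return sock->is_smc == 1; }
static int tcp_poll_model(struct sock_model *sock) { return sock->is_smc == 0; }

static const struct ops_model smc_ops = { smc_poll_model };
static const struct ops_model tcp_ops = { tcp_poll_model };

/* Models the writer: repoint the file at the fallback socket. */
static void switch_to_fallback(struct file_model *file, struct sock_model *clc)
{
    file->private_data = clc;
}

/* Hazard: private_data is loaded twice and the writer runs in between. */
static int poll_with_reload(struct file_model *file, struct sock_model *clc)
{
    const struct ops_model *ops = file->private_data->ops; /* first load */
    switch_to_fallback(file, clc);                         /* simulated race */
    return ops->poll(file->private_data);                  /* second load */
}

/* Safe shape: one load, and every later use goes through the local. */
static int poll_single_load(struct file_model *file, struct sock_model *clc)
{
    struct sock_model *sock = file->private_data;          /* the only load */
    const struct ops_model *ops = sock->ops;
    switch_to_fallback(file, clc);                         /* simulated race */
    return ops->poll(sock);
}
```

In this model, `poll_with_reload()` hands the TCP socket to the SMC poll
hook and returns 0, while `poll_single_load()` keeps `ops` and `sock`
paired and returns 1. Whether a real compiler actually emits the second
load depends on the build; the model only shows why it must not.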

### Conclusion
This data race is **harmful** because it can lead to type confusion
and memory corruption. It is not a simple statistics counter or a
benign flag race; it involves the fundamental identity of the socket
object being operated on.

The fix for this would typically involve using `READ_ONCE` and
`WRITE_ONCE` to prevent compiler reloads and ensure atomicity, or
better yet, avoiding the mid-flight change of `file->private_data`
altogether.
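
As a rough illustration of the `READ_ONCE`/`WRITE_ONCE` option, the
following userspace sketch defines simplified approximations of the two
macros using volatile casts (GNU C `__typeof__`); the real kernel
definitions are more involved. The struct and function names
(`file_sketch`, `socket_sketch`, `publish_fallback`, `snapshot_socket`)
are hypothetical and are not the actual kernel code or a proposed patch.

```c
#include <assert.h>

/* Simplified approximations of the kernel macros, for illustration only. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-ins for struct socket and struct file. */
struct socket_sketch { int id; };
struct file_sketch { void *private_data; };

/* Writer side: how the fallback path might publish the new pointer. */
static void publish_fallback(struct file_sketch *file,
                             struct socket_sketch *clc)
{
    WRITE_ONCE(file->private_data, clc);
}

/* Reader side: snapshot the pointer exactly once, then use the local. */
static struct socket_sketch *snapshot_socket(struct file_sketch *file)
{
    return READ_ONCE(file->private_data);
}
```

A reader that calls `snapshot_socket()` once and uses only the returned
pointer cannot mix fields from two sockets. Note that these annotations
by themselves do not order the pointer update against other stores such
as `smc->use_fallback`, so they would not address the inconsistent-state
concern in point 4.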
