The Linux Kernel Mailing List
* [syzbot] [wireguard?] KCSAN: data-race in wg_socket_send_skb_to_peer / wg_socket_send_skb_to_peer (9)
@ 2026-06-01 14:33 syzbot
  2026-06-22 19:34 ` Rafael Passos
  0 siblings, 1 reply; 6+ messages in thread
From: syzbot @ 2026-06-01 14:33 UTC (permalink / raw)
  To: Jason, andrew+netdev, davem, edumazet, kuba, linux-kernel, netdev,
	pabeni, syzkaller-bugs, wireguard

Hello,

syzbot found the following issue on:

HEAD commit:    9215e74f228f Merge tag 'block-7.1-20260529' of git://git.k..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10465ef2580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f571f22917457cd8
dashboard link: https://syzkaller.appspot.com/bug?extid=9ca7674fa7521a3f1bc2
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/1ddf3069118d/disk-9215e74f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/0913e4ffbdb8/vmlinux-9215e74f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3fe3943ae796/bzImage-9215e74f.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+9ca7674fa7521a3f1bc2@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in wg_socket_send_skb_to_peer / wg_socket_send_skb_to_peer

read-write to 0xffff88811af99028 of 8 bytes by task 310 on cpu 1:
 wg_socket_send_skb_to_peer+0xe8/0x130 drivers/net/wireguard/socket.c:182
 wg_socket_send_buffer_to_peer+0xf1/0x120 drivers/net/wireguard/socket.c:199
 wg_packet_send_handshake_initiation drivers/net/wireguard/send.c:40 [inline]
 wg_packet_handshake_send_worker+0x10d/0x160 drivers/net/wireguard/send.c:51
 process_one_work kernel/workqueue.c:3314 [inline]
 process_scheduled_works+0x4f0/0x9c0 kernel/workqueue.c:3397
 worker_thread+0x58a/0x780 kernel/workqueue.c:3478
 kthread+0x22a/0x280 kernel/kthread.c:436
 ret_from_fork+0x146/0x330 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

read-write to 0xffff88811af99028 of 8 bytes by task 15360 on cpu 0:
 wg_socket_send_skb_to_peer+0xe8/0x130 drivers/net/wireguard/socket.c:182
 wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
 wg_packet_tx_worker+0x12d/0x330 drivers/net/wireguard/send.c:276
 process_one_work kernel/workqueue.c:3314 [inline]
 process_scheduled_works+0x4f0/0x9c0 kernel/workqueue.c:3397
 worker_thread+0x58a/0x780 kernel/workqueue.c:3478
 kthread+0x22a/0x280 kernel/kthread.c:436
 ret_from_fork+0x146/0x330 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

value changed: 0x0000000000000a2c -> 0x0000000000000ac0

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 15360 Comm: kworker/0:2 Tainted: G        W           syzkaller #0 PREEMPT(lazy) 
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
Workqueue: wg-crypt-wg2 wg_packet_tx_worker
==================================================================
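
The "value changed" line can be checked by hand. The sketch below is plain userspace C with hypothetical helper names (none are kernel symbols). It computes the counter delta and, assuming the handshake initiation layout from the WireGuard protocol description, sums that message's field sizes for comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these are kernel symbols. */

/* Difference between the two counter values KCSAN printed. */
uint64_t counter_delta(uint64_t before, uint64_t after)
{
	return after - before;
}

/* Size of a WireGuard handshake initiation message, summed field by field. */
size_t handshake_initiation_len(void)
{
	const size_t type_and_reserved = 4;    /* message type + reserved bytes */
	const size_t sender_index = 4;
	const size_t ephemeral = 32;           /* unencrypted ephemeral public key */
	const size_t enc_static = 32 + 16;     /* static public key + AEAD tag */
	const size_t enc_timestamp = 12 + 16;  /* TAI64N timestamp + AEAD tag */
	const size_t macs = 16 + 16;           /* mac1 + mac2 */

	return type_and_reserved + sender_index + ephemeral +
	       enc_static + enc_timestamp + macs;
}
```

Both helpers give 148 for the values in this report: `counter_delta(0xa2c, 0xac0)` and `handshake_initiation_len()`. That would be consistent with the handshake initiation sender in the first stack trace having bumped the counter by one message, though the report alone does not prove it.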


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [wireguard?] KCSAN: data-race in wg_socket_send_skb_to_peer / wg_socket_send_skb_to_peer (9)
  2026-06-01 14:33 [syzbot] [wireguard?] KCSAN: data-race in wg_socket_send_skb_to_peer / wg_socket_send_skb_to_peer (9) syzbot
@ 2026-06-22 19:34 ` Rafael Passos
  2026-06-28 20:38   ` [PATCH] Wireguard: Fix data-race in rx/tx counter Rafael Passos
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael Passos @ 2026-06-22 19:34 UTC (permalink / raw)
  To: Jason, andrew+netdev, davem, edumazet, kuba, linux-kernel, netdev,
	pabeni, syzkaller-bugs, wireguard, syzbot

Hi,

I started investigating this KCSAN warning by syzbot, and would like to
ask a few questions.

On Mon Jun 1, 2026 at 11:33 AM -03, syzbot wrote:
> ==================================================================
> BUG: KCSAN: data-race in wg_socket_send_skb_to_peer / wg_socket_send_skb_to_peer
>
> read-write to 0xffff88811af99028 of 8 bytes by task 310 on cpu 1:
>  wg_socket_send_skb_to_peer+0xe8/0x130 drivers/net/wireguard/socket.c:182
>  wg_socket_send_buffer_to_peer+0xf1/0x120 drivers/net/wireguard/socket.c:199
>  wg_packet_send_handshake_initiation drivers/net/wireguard/send.c:40 [inline]
>  wg_packet_handshake_send_worker+0x10d/0x160 drivers/net/wireguard/send.c:51
>  process_one_work kernel/workqueue.c:3314 [inline]
>  process_scheduled_works+0x4f0/0x9c0 kernel/workqueue.c:3397
>  worker_thread+0x58a/0x780 kernel/workqueue.c:3478
>  kthread+0x22a/0x280 kernel/kthread.c:436
>  ret_from_fork+0x146/0x330 arch/x86/kernel/process.c:158
>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>
> read-write to 0xffff88811af99028 of 8 bytes by task 15360 on cpu 0:
>  wg_socket_send_skb_to_peer+0xe8/0x130 drivers/net/wireguard/socket.c:182
>  wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
>  wg_packet_tx_worker+0x12d/0x330 drivers/net/wireguard/send.c:276
>  process_one_work kernel/workqueue.c:3314 [inline]
>  process_scheduled_works+0x4f0/0x9c0 kernel/workqueue.c:3397
>  worker_thread+0x58a/0x780 kernel/workqueue.c:3478
>  kthread+0x22a/0x280 kernel/kthread.c:436
>  ret_from_fork+0x146/0x330 arch/x86/kernel/process.c:158
>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>
> value changed: 0x0000000000000a2c -> 0x0000000000000ac0
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 UID: 0 PID: 15360 Comm: kworker/0:2 Tainted: G        W           syzkaller #0 PREEMPT(lazy) 
> Tainted: [W]=WARN
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
> Workqueue: wg-crypt-wg2 wg_packet_tx_worker

I tracked the change to this counter increment in `wg_socket_send_skb_to_peer`

+++ b/drivers/net/wireguard/socket.c
@@ -179,7 +179,8 @@ int wg_socket_send_skb_to_peer(struct wg_peer *peer, struct sk_buff *skb, u8 ds)
 	else
 		dev_kfree_skb(skb);
 	if (likely(!ret))
->		peer->tx_bytes += skb_len;  <- protected by a read_lock_bh only
 	read_unlock_bh(&peer->endpoint_lock);

It is protected only by the read side of an rwlock, which admits several
readers at once. If I read the stack traces correctly, `wg_socket_send_skb_to_peer`
is being called concurrently from the handshake path
(wg_packet_send_handshake_initiation) and from the transmit worker
(wg_packet_tx_worker).

Does this make sense? Can such calls really happen outside of fuzzing?

Out of curiosity, I changed `tx_bytes` and `rx_bytes` from u64 to atomic64_t
in peer.h, and also the r/w ops in netlink.c, receive.c and socket.c files.
I ran the wireguard kselftest suite with and without this patch, and it
worked fine. Iperf results seem fine (on amd64).
I'm not sure if this should be the solution, or if this is even a real issue in the first place.
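
The atomic-counter idea can be sketched in userspace with C11 atomics. This is an analogy, not the kernel change itself: `_Atomic uint64_t` stands in for `atomic64_t`, relaxed `atomic_fetch_add_explicit` for `atomic64_add`, and the struct and function names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_THREADS 16

/* Hypothetical stand-in for the per-peer counter. */
struct peer_stats {
	_Atomic uint64_t tx_bytes;
};

struct adder_arg {
	struct peer_stats *stats;
	int iters;
	uint64_t len;
};

/* Each thread adds `len` to the shared counter `iters` times. */
static void *adder(void *p)
{
	struct adder_arg *a = p;

	for (int i = 0; i < a->iters; i++)
		atomic_fetch_add_explicit(&a->stats->tx_bytes, a->len,
					  memory_order_relaxed);
	return NULL;
}

/* Run up to MAX_THREADS concurrent adders and return the final total. */
uint64_t run_atomic_adders(int nthreads, int iters, uint64_t len)
{
	struct peer_stats stats;
	struct adder_arg arg = { &stats, iters, len };
	pthread_t tids[MAX_THREADS];
	int started = 0;

	atomic_init(&stats.tx_bytes, 0);
	if (nthreads > MAX_THREADS)
		nthreads = MAX_THREADS;
	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, adder, &arg) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return atomic_load_explicit(&stats.tx_bytes, memory_order_relaxed);
}
```

With every update being an atomic read-modify-write, no increment is lost: `run_atomic_adders(4, 100000, 148)` returns exactly `4 * 100000 * 148`. The price is that all threads contend on one cache line for each packet.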

Any comments ?

Eager to learn.
Thanks,

Rafael Passos


* [PATCH] Wireguard: Fix data-race in rx/tx counter
  2026-06-22 19:34 ` Rafael Passos
@ 2026-06-28 20:38   ` Rafael Passos
  2026-06-28 21:02     ` Andrew Lunn
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael Passos @ 2026-06-28 20:38 UTC (permalink / raw)
  To: rafael
  Cc: Jason, andrew+netdev, davem, edumazet, kuba, linux-kernel, netdev,
	pabeni, syzbot+9ca7674fa7521a3f1bc2, syzkaller-bugs, wireguard

Fix a data race in the {rx,tx}_bytes counters of a WireGuard peer.
These values were incremented inside a read_lock_bh block, but nothing
serialized the writes. Making them atomic was the simplest way out.
This was found by syzbot with KCSAN.

Reported-by: syzbot+9ca7674fa7521a3f1bc2@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=9ca7674fa7521a3f1bc2
Signed-off-by: Rafael Passos <rafael@rcpassos.me>
---

Hi,

I am posting this patch to better illustrate the discussion.
If this is a non-issue, that's fine.
As I mentioned in the previous email, this issue was reported by syzbot,
but I was not able to reproduce it.
I am also aware that atomic calls may introduce extra cost on older ARM CPUs.
I would like to hear from the community: would this be an adequate solution?

Thanks,

Rafael Passos



 drivers/net/wireguard/netlink.c | 4 ++--
 drivers/net/wireguard/peer.h    | 2 +-
 drivers/net/wireguard/receive.c | 2 +-
 drivers/net/wireguard/socket.c  | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c
index 1da7e98d0d509..ec66f79e46377 100644
--- a/drivers/net/wireguard/netlink.c
+++ b/drivers/net/wireguard/netlink.c
@@ -109,9 +109,9 @@ get_peer(struct wg_peer *peer, struct sk_buff *skb, struct dump_ctx *ctx)
 			    sizeof(last_handshake), &last_handshake) ||
 		    nla_put_u16(skb, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
 				peer->persistent_keepalive_interval) ||
-		    nla_put_u64_64bit(skb, WGPEER_A_TX_BYTES, peer->tx_bytes,
+		    nla_put_u64_64bit(skb, WGPEER_A_TX_BYTES, atomic64_read(&peer->tx_bytes),
 				      WGPEER_A_UNSPEC) ||
-		    nla_put_u64_64bit(skb, WGPEER_A_RX_BYTES, peer->rx_bytes,
+		    nla_put_u64_64bit(skb, WGPEER_A_RX_BYTES, atomic64_read(&peer->rx_bytes),
 				      WGPEER_A_UNSPEC) ||
 		    nla_put_u32(skb, WGPEER_A_PROTOCOL_VERSION, 1))
 			goto err;
diff --git a/drivers/net/wireguard/peer.h b/drivers/net/wireguard/peer.h
index 718fb42bdac7e..01c4b80086759 100644
--- a/drivers/net/wireguard/peer.h
+++ b/drivers/net/wireguard/peer.h
@@ -49,7 +49,7 @@ struct wg_peer {
 	struct work_struct transmit_handshake_work, clear_peer_work, transmit_packet_work;
 	struct cookie latest_cookie;
 	struct hlist_node pubkey_hash;
-	u64 rx_bytes, tx_bytes;
+	atomic64_t rx_bytes, tx_bytes;
 	struct timer_list timer_retransmit_handshake, timer_send_keepalive;
 	struct timer_list timer_new_handshake, timer_zero_key_material;
 	struct timer_list timer_persistent_keepalive;
diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index eb8851113654f..500d86576c692 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -20,7 +20,7 @@
 static void update_rx_stats(struct wg_peer *peer, size_t len)
 {
 	dev_sw_netstats_rx_add(peer->device->dev, len);
-	peer->rx_bytes += len;
+	atomic64_add(len, &peer->rx_bytes);
 }
 
 #define SKB_TYPE_LE32(skb) (((struct message_header *)(skb)->data)->type)
diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
index 0028ef17dc716..9e8a49b9078f2 100644
--- a/drivers/net/wireguard/socket.c
+++ b/drivers/net/wireguard/socket.c
@@ -179,7 +179,7 @@ int wg_socket_send_skb_to_peer(struct wg_peer *peer, struct sk_buff *skb, u8 ds)
 	else
 		dev_kfree_skb(skb);
 	if (likely(!ret))
-		peer->tx_bytes += skb_len;
+		atomic64_add(skb_len, &peer->tx_bytes);
 	read_unlock_bh(&peer->endpoint_lock);
 
 	return ret;
-- 
2.53.0



* Re: [PATCH] Wireguard: Fix data-race in rx/tx counter
  2026-06-28 20:38   ` [PATCH] Wireguard: Fix data-race in rx/tx counter Rafael Passos
@ 2026-06-28 21:02     ` Andrew Lunn
  2026-06-29  2:34       ` Theodore Tso
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Lunn @ 2026-06-28 21:02 UTC (permalink / raw)
  To: Rafael Passos
  Cc: Jason, andrew+netdev, davem, edumazet, kuba, linux-kernel, netdev,
	pabeni, syzbot+9ca7674fa7521a3f1bc2, syzkaller-bugs, wireguard

On Sun, Jun 28, 2026 at 05:38:23PM -0300, Rafael Passos wrote:
> Fix a data race in the {rx,tx}_bytes counters of a WireGuard peer.
> These values were incremented inside a read_lock_bh block, but nothing
> serialized the writes. Making them atomic was the simplest way out.
> This was found by syzbot with KCSAN.
> 
> Reported-by: syzbot+9ca7674fa7521a3f1bc2@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?extid=9ca7674fa7521a3f1bc2
> Signed-off-by: Rafael Passos <rafael@rcpassos.me>
> ---
> 
> Hi,
> 
> I am posting this patch to better illustrate the discussion.
> If this is a non-issue, that's fine.
> As I mentioned in the previous email, this issue was reported by syzbot,
> but I was not able to reproduce it.
> I am also aware atomic calls may introduce extra cost on older arm cpus.

Atomics are expensive in general, especially on high CPU count
systems. 

Statistic counters tend to be very asymmetric in usage. They are
incremented frequently, maybe per packet, but reported very
infrequently, maybe every minute when an SNMP agent reads them. So the
solution to statistic counters should reflect this. Increment should
be very cheap, reporting them can be expensive.

There are a few different solutions. Per-CPU counters are
one. u64_stats_sync.h may help.

Please take a look at other drivers doing statistics. This is a solved
problem, you just need to copy bits of code from somewhere else.
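
The asymmetry described above (cheap increment, expensive read) can be sketched in userspace with one counter slot per thread. This is an analogy to per-CPU counters, not kernel code: the slot count, struct, and function names are hypothetical, and each thread plays the role of one CPU.

```c
#include <pthread.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdint.h>

#define NSLOTS 4  /* stand-in for the number of CPUs */

/* One counter per slot, on its own cache line to avoid false sharing. */
struct stat_slot {
	alignas(64) _Atomic uint64_t bytes;
};

struct sharded_stats {
	struct stat_slot slot[NSLOTS];
};

/* Hot path: only the owner of `id` writes this slot, so a relaxed
 * load/add/store is enough and no locked read-modify-write is needed. */
void stats_add(struct sharded_stats *s, int id, uint64_t len)
{
	uint64_t old = atomic_load_explicit(&s->slot[id].bytes,
					    memory_order_relaxed);

	atomic_store_explicit(&s->slot[id].bytes, old + len,
			      memory_order_relaxed);
}

/* Rare path: the reader walks every slot and sums them. */
uint64_t stats_sum(struct sharded_stats *s)
{
	uint64_t total = 0;

	for (int i = 0; i < NSLOTS; i++)
		total += atomic_load_explicit(&s->slot[i].bytes,
					      memory_order_relaxed);
	return total;
}

struct worker_arg {
	struct sharded_stats *s;
	int id;
	int iters;
	uint64_t len;
};

static void *worker(void *p)
{
	struct worker_arg *a = p;

	for (int i = 0; i < a->iters; i++)
		stats_add(a->s, a->id, a->len);
	return NULL;
}

/* Run one writer per slot, then report the summed total. */
uint64_t run_sharded_adders(int iters, uint64_t len)
{
	struct sharded_stats stats;
	struct worker_arg args[NSLOTS];
	pthread_t tids[NSLOTS];
	int started = 0;

	for (int i = 0; i < NSLOTS; i++)
		atomic_init(&stats.slot[i].bytes, 0);
	for (int i = 0; i < NSLOTS; i++) {
		args[i] = (struct worker_arg){ &stats, i, iters, len };
		if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return stats_sum(&stats);
}
```

Writers never touch each other's cache lines, so the per-packet cost stays low. The reader pays by visiting every slot, and the memory cost grows with the number of slots.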

      Andrew


* Re: [PATCH] Wireguard: Fix data-race in rx/tx counter
  2026-06-28 21:02     ` Andrew Lunn
@ 2026-06-29  2:34       ` Theodore Tso
  2026-06-29  3:05         ` Rafael Passos
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Tso @ 2026-06-29  2:34 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Rafael Passos, Jason, andrew+netdev, davem, edumazet, kuba,
	linux-kernel, netdev, pabeni, syzbot+9ca7674fa7521a3f1bc2,
	syzkaller-bugs, wireguard

On Sun, Jun 28, 2026 at 11:02:05PM +0200, Andrew Lunn wrote:
> On Sun, Jun 28, 2026 at 05:38:23PM -0300, Rafael Passos wrote:
> > Fix a data race in the {rx,tx}_bytes counters of a WireGuard peer.
> > These values were incremented inside a read_lock_bh block, but nothing
> > serialized the writes. Making them atomic was the simplest way out.
> > This was found by syzbot with KCSAN.
> 
> Atomics are expensive in general, especially on high CPU count
> systems. 
> 
> Statistic counters tend to be very asymmetric in usage. They are
> incremented frequently, maybe per packet, but reported very
> infrequently, maybe every minute when an SNMP agent reads them.

One of the reasons why kcsan and syzbot can be quite noisy is that a
human being needs to *think* and consider whether or not this is
actually important.  (One of the reasons why I'm not all that worried
about our new AI overlords taking over the world.  :-) Consider what
is the worst that might happen if the tx/rx_bytes counter might not be
completely accurate?  Is it worth the performance penalty of using
atomics (or the memory overhead of per-CPU counters)?

	    	       		   - Ted


* Re: [PATCH] Wireguard: Fix data-race in rx/tx counter
  2026-06-29  2:34       ` Theodore Tso
@ 2026-06-29  3:05         ` Rafael Passos
  0 siblings, 0 replies; 6+ messages in thread
From: Rafael Passos @ 2026-06-29  3:05 UTC (permalink / raw)
  To: Theodore Tso, Andrew Lunn
  Cc: Rafael Passos, Jason, andrew+netdev, davem, edumazet, kuba,
	linux-kernel, netdev, pabeni, syzbot+9ca7674fa7521a3f1bc2,
	syzkaller-bugs, wireguard

On Sun Jun 28, 2026 at 11:34 PM -03, Theodore Tso wrote:
> One of the reasons why kcsan and syzbot can be quite noisy is that a
> human being needs to *think* and consider whether or not this is
> actually important.  (One of the reasons why I'm not all that worried
> about our new AI overlords taking over the world.  :-) Consider what
> is the worst that might happen if the tx/rx_bytes counter might not be
> completely accurate?  Is it worth the performance penalty of using
> atomics (or the memory overhead of per-CPU counters)?
Yeah, I guess not.
Still, it was very interesting learning all this. I only knew per-CPU
counters by name, and Andrew's response led me to actually understand them.

I would like to thank you both!
And if I may, I would like to send my v2 patch (as a response in this
thread) just because it was very fun making and testing it. And I would
love feedback on it, if anything looks wrong. This was an amazing
learning opportunity for me.

Thanks,
Rafael Passos

