Netdev List
* [syzbot] [net?] WARNING in tls_err_abort
From: syzbot @ 2026-06-16 14:27 UTC
  To: davem, edumazet, horms, john.fastabend, kuba, linux-kernel,
	netdev, pabeni, sd, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    f6033078a9e6 ip6_tunnel: annotate data-races around t->err..
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=122a98ae580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8697a140486f5628
dashboard link: https://syzkaller.appspot.com/bug?extid=cca46a9d1276f38af2ae
compiler:       Debian clang version 22.1.6 (++20260514074242+fc4aad7b5db3-1~exp1~20260514074407.73), Debian LLD 22.1.6

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/7af9eb2b9b5a/disk-f6033078.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/4b7e03b76e68/vmlinux-f6033078.xz
kernel image: https://storage.googleapis.com/syzbot-assets/38042dd09caa/bzImage-f6033078.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+cca46a9d1276f38af2ae@syzkaller.appspotmail.com

------------[ cut here ]------------
err >= 0
WARNING: net/tls/tls_sw.c:73 at tls_err_abort+0x5d/0x80 net/tls/tls_sw.c:73, CPU#0: kworker/0:11/6099
Modules linked in:
CPU: 0 UID: 0 PID: 6099 Comm: kworker/0:11 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
Workqueue: pencrypt_serial padata_serial_worker
RIP: 0010:tls_err_abort+0x5d/0x80 net/tls/tls_sw.c:73
Code: e8 03 48 b9 00 00 00 00 00 fc ff df 0f b6 04 08 84 c0 75 1b 89 ab 9c 01 00 00 48 89 df 5b 5d e9 c9 a2 32 ff e8 a4 60 8a f7 90 <0f> 0b 90 eb c3 89 f9 80 e1 07 80 c1 03 38 c1 7c d9 e8 1d 9f f5 f7
RSP: 0018:ffffc900069379e0 EFLAGS: 00010293
RAX: ffffffff8a3adf8c RBX: ffff88807d1e0d80 RCX: ffff888058bfdd00
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000ffffffff
RBP: 0000000000000000 R08: ffffe8ffffc513e3 R09: 1ffffd1ffff8a27c
R10: dffffc0000000000 R11: ffffffff8a3c4d70 R12: ffff888028eaf400
R13: ffff88804441030c R14: dffffc0000000000 R15: ffff888028eaf460
FS:  0000000000000000(0000) GS:ffff8881252a0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f521f503ff8 CR3: 0000000086fc2000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 tls_encrypt_done+0x223/0x480 net/tls/tls_sw.c:500
 padata_serial_worker+0x2b9/0x430 kernel/padata.c:343
 process_one_work kernel/workqueue.c:3314 [inline]
 process_scheduled_works+0xa8e/0x14e0 kernel/workqueue.c:3397
 worker_thread+0xa47/0xfb0 kernel/workqueue.c:3478
 kthread+0x389/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [net?] WARNING in tls_err_abort
From: Sabrina Dubroca @ 2026-06-16 15:19 UTC
  To: syzbot
  Cc: davem, edumazet, horms, john.fastabend, kuba, linux-kernel,
	netdev, pabeni, syzkaller-bugs

2026-06-16, 07:27:20 -0700, syzbot wrote:
> ------------[ cut here ]------------
> err >= 0
> WARNING: net/tls/tls_sw.c:73 at tls_err_abort+0x5d/0x80 net/tls/tls_sw.c:73, CPU#0: kworker/0:11/6099
> Modules linked in:
> CPU: 0 UID: 0 PID: 6099 Comm: kworker/0:11 Not tainted syzkaller #0 PREEMPT(full) 
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
> Workqueue: pencrypt_serial padata_serial_worker
> RIP: 0010:tls_err_abort+0x5d/0x80 net/tls/tls_sw.c:73
> Code: e8 03 48 b9 00 00 00 00 00 fc ff df 0f b6 04 08 84 c0 75 1b 89 ab 9c 01 00 00 48 89 df 5b 5d e9 c9 a2 32 ff e8 a4 60 8a f7 90 <0f> 0b 90 eb c3 89 f9 80 e1 07 80 c1 03 38 c1 7c d9 e8 1d 9f f5 f7
> RSP: 0018:ffffc900069379e0 EFLAGS: 00010293
> RAX: ffffffff8a3adf8c RBX: ffff88807d1e0d80 RCX: ffff888058bfdd00
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000ffffffff
> RBP: 0000000000000000 R08: ffffe8ffffc513e3 R09: 1ffffd1ffff8a27c
> R10: dffffc0000000000 R11: ffffffff8a3c4d70 R12: ffff888028eaf400
> R13: ffff88804441030c R14: dffffc0000000000 R15: ffff888028eaf460
> FS:  0000000000000000(0000) GS:ffff8881252a0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f521f503ff8 CR3: 0000000086fc2000 CR4: 00000000003526f0
> Call Trace:
>  <TASK>
>  tls_encrypt_done+0x223/0x480 net/tls/tls_sw.c:500


	/* Check if error is previously set on socket */
	if (err || sk->sk_err) {
		rec = NULL;

		/* If err is already set on socket, return the same code */
		if (sk->sk_err) {
			ctx->async_wait.err = -sk->sk_err;
		} else {
			ctx->async_wait.err = err;
			tls_err_abort(sk, err);
		}
	}

I suspect err==0, and sock_error() consumed sk_err in between (the
alternative would be err > 0).

Something like this?

-------- 8< --------
@@ -473,6 +473,7 @@ static void tls_encrypt_done(void *data, int err)
 	struct scatterlist *sge;
 	struct sk_msg *msg_en;
 	struct sock *sk;
+	int sk_err;
 
 	if (err == -EINPROGRESS) /* see the comment in tls_decrypt_done() */
 		return;
@@ -489,12 +490,13 @@ static void tls_encrypt_done(void *data, int err)
 	sge->length += prot->prepend_size;
 
 	/* Check if error is previously set on socket */
-	if (err || sk->sk_err) {
+	sk_err = READ_ONCE(sk->sk_err);
+	if (err || sk_err) {
 		rec = NULL;
 
 		/* If err is already set on socket, return the same code */
-		if (sk->sk_err) {
-			ctx->async_wait.err = -sk->sk_err;
+		if (sk_err) {
+			ctx->async_wait.err = -sk_err;
 		} else {
 			ctx->async_wait.err = err;
 			tls_err_abort(sk, err);
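
The suspected interleaving can be replayed deterministically in user space. What follows is a minimal sketch, not kernel code: `struct sock_model`, `sock_error_model()`, `struct outcome` and both `encrypt_done_*()` helpers are hypothetical names, and the `consume_between` flag stands in for a concurrent sock_error() caller. It models only the branch logic quoted above, under the assumption that sock_error() clears sk_err when it returns it.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the single struct sock field that matters here. */
struct sock_model {
	atomic_int sk_err;	/* positive errno, 0 when nothing is pending */
};

/* Assumed behaviour of sock_error(): take the pending error and clear it. */
static int sock_error_model(struct sock_model *sk)
{
	return -atomic_exchange(&sk->sk_err, 0);
}

/* What the error branch decided (hypothetical bookkeeping for this demo). */
struct outcome {
	int async_err;		/* value stored in ctx->async_wait.err */
	int abort_called;	/* 1 if tls_err_abort() would have been called */
	int abort_arg;		/* err argument passed to tls_err_abort() */
};

/* Current logic: sk_err is read again after the initial check. */
static struct outcome encrypt_done_two_reads(struct sock_model *sk, int err,
					     int consume_between)
{
	struct outcome o = { 0, 0, 0 };

	if (err || atomic_load(&sk->sk_err)) {
		if (consume_between)
			sock_error_model(sk);	/* simulated concurrent consumer */

		if (atomic_load(&sk->sk_err)) {
			o.async_err = -atomic_load(&sk->sk_err);
		} else {
			o.async_err = err;
			o.abort_called = 1;	/* err == 0 here trips "err >= 0" */
			o.abort_arg = err;
		}
	}
	return o;
}

/* Proposed logic: read sk_err once and decide on the snapshot. */
static struct outcome encrypt_done_snapshot(struct sock_model *sk, int err,
					    int consume_between)
{
	struct outcome o = { 0, 0, 0 };
	int sk_err = atomic_load(&sk->sk_err);

	if (err || sk_err) {
		if (consume_between)
			sock_error_model(sk);	/* no longer affects the decision */

		if (sk_err) {
			o.async_err = -sk_err;
		} else {
			o.async_err = err;
			o.abort_called = 1;
			o.abort_arg = err;
		}
	}
	return o;
}
```

With sk_err preset to EBADMSG and err == 0, the two-read variant ends up in the tls_err_abort() branch with an argument of 0, which is what the `err >= 0` warning catches. The snapshot variant reports -EBADMSG and never reaches tls_err_abort().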

-- 
Sabrina


* Re: [syzbot] [net?] WARNING in tls_err_abort
From: Jakub Kicinski @ 2026-06-16 15:28 UTC
  To: Sabrina Dubroca
  Cc: syzbot, davem, edumazet, horms, john.fastabend, linux-kernel,
	netdev, pabeni, syzkaller-bugs

On Tue, 16 Jun 2026 17:19:22 +0200 Sabrina Dubroca wrote:
> I suspect err==0, and sock_error() consumed sk_err in between (the
> alternative would be err > 0).
> 
> Something like this?

Makes sense, but what's eating sk_err? Don't we depend on it being set
to avoid further state transitions once we hit a crypto error?
I thought that's why we don't consume sk_err in recvmsg and sendmsg in
the first place (we are not calling sock_error() anywhere)


* Re: [syzbot] [net?] WARNING in tls_err_abort
From: Sabrina Dubroca @ 2026-06-16 21:00 UTC
  To: Jakub Kicinski
  Cc: syzbot, davem, edumazet, horms, john.fastabend, linux-kernel,
	netdev, pabeni, syzkaller-bugs

2026-06-16, 08:28:16 -0700, Jakub Kicinski wrote:
> On Tue, 16 Jun 2026 17:19:22 +0200 Sabrina Dubroca wrote:
> > I suspect err==0, and sock_error() consumed sk_err in between (the
> > alternative would be err > 0).
> > 
> > Something like this?
> 
> Makes sense, but what's eating sk_err?

The 2 remaining sock_error() in tls_rx_rec_wait()? [1]

> Don't we depend on it being set
> to avoid further state transitions once we hit a crypto error?

I kind of thought so too.

> I thought that's why we don't consume sk_err in recvmsg and sendmsg in
> the first place (we are not calling sock_error() anywhere)

Umm...
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/tls/tls_sw.c#n1095

-- 
Sabrina


* Re: [syzbot] [net?] WARNING in tls_err_abort
From: Jakub Kicinski @ 2026-06-16 21:23 UTC
  To: Sabrina Dubroca
  Cc: syzbot, davem, edumazet, horms, john.fastabend, linux-kernel,
	netdev, pabeni, syzkaller-bugs

On Tue, 16 Jun 2026 23:00:54 +0200 Sabrina Dubroca wrote:
> 2026-06-16, 08:28:16 -0700, Jakub Kicinski wrote:
> > On Tue, 16 Jun 2026 17:19:22 +0200 Sabrina Dubroca wrote:  
> > > I suspect err==0, and sock_error() consumed sk_err in between (the
> > > alternative would be err > 0).
> > > 
> > > Something like this?  
> > 
> > Makes sense, but what's eating sk_err?  
> 
> The 2 remaining sock_error() in tls_rx_rec_wait()? [1]

How did that elude my grep..

> > Don't we depend on it being set
> > to avoid further state transitions once we hit a crypto error?  
> 
> I kind of thought so too.

In which case the question is whether we should try to remove 
the sock_error() instead? (stating the obvious I guess)

> > I thought that's why we don't consume sk_err in recvmsg and sendmsg in
> > the first place (we are not calling sock_error() anywhere)  
> 
> Umm...
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/tls/tls_sw.c#n1095
> 



* Re: [syzbot] [net?] WARNING in tls_err_abort
From: Sabrina Dubroca @ 2026-06-16 21:46 UTC
  To: Jakub Kicinski
  Cc: syzbot, davem, edumazet, horms, john.fastabend, linux-kernel,
	netdev, pabeni, syzkaller-bugs

2026-06-16, 14:23:59 -0700, Jakub Kicinski wrote:
> On Tue, 16 Jun 2026 23:00:54 +0200 Sabrina Dubroca wrote:
> > 2026-06-16, 08:28:16 -0700, Jakub Kicinski wrote:
> > > On Tue, 16 Jun 2026 17:19:22 +0200 Sabrina Dubroca wrote:  
> > > > I suspect err==0, and sock_error() consumed sk_err in between (the
> > > > alternative would be err > 0).
> > > > 
> > > > Something like this?  
> > > 
> > > Makes sense, but what's eating sk_err?  
> > 
> > The 2 remaining sock_error() in tls_rx_rec_wait()? [1]
> 
> How did that elude my grep..

:)

> > > Don't we depend on it being set
> > > to avoid further state transitions once we hit a crypto error?  
> > 
> > I kind of thought so too.
> 
> In which case the question is whether we should try to remove 
> the sock_error() instead? (stating the obvious I guess)

That would make sense, but we can't prevent sock_error() being called
from some helper.

The only relevant one for ktls at the moment seems to be
sk_stream_error(), and I think via sk_stream_wait_memory() we can hit
that EPIPE.


tls_sw_sendmsg_locked() has:
...
end:
	ret = sk_stream_error(sk, msg->msg_flags, ret);
	return copied > 0 ? copied : ret;


int sk_stream_error(struct sock *sk, int flags, int err)
{
	if (err == -EPIPE)
		err = sock_error(sk) ? : -EPIPE;
...
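
To make the effect on sk_err concrete, here is a small user-space model of the part of sk_stream_error() quoted above. It is a sketch under assumptions: `struct sock_model`, `sock_error_model()` and `sk_stream_error_model()` are hypothetical names, sock_error() is assumed to clear sk_err when it returns it, the flags argument is dropped because the quoted lines do not use it, and the elided remainder of the function is deliberately not modeled.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the single struct sock field that matters here. */
struct sock_model {
	atomic_int sk_err;	/* positive errno, 0 when nothing is pending */
};

/* Assumed behaviour of sock_error(): take the pending error and clear it. */
static int sock_error_model(struct sock_model *sk)
{
	return -atomic_exchange(&sk->sk_err, 0);
}

/*
 * Models only the two quoted lines of sk_stream_error(): on -EPIPE the
 * pending socket error (if any) replaces the return value, and as a side
 * effect sk_err is reset to 0.
 */
static int sk_stream_error_model(struct sock_model *sk, int err)
{
	if (err == -EPIPE) {
		int pending = sock_error_model(sk);

		err = pending ? pending : -EPIPE;
	}
	return err;
}
```

In this model, a send path that fails with -EPIPE while sk_err holds EBADMSG returns -EBADMSG once and leaves sk_err at 0, so a later check of sk_err no longer sees the earlier error. Any other error value passes through without touching sk_err.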

-- 
Sabrina
