netdev.vger.kernel.org archive mirror
* Bug in tcp_rtx_synack?
@ 2022-05-30 10:08 Laurent Fasnacht
  2022-05-30 18:09 ` Eric Dumazet
  0 siblings, 1 reply; 2+ messages in thread
From: Laurent Fasnacht @ 2022-05-30 10:08 UTC (permalink / raw)
  To: netdev@vger.kernel.org

Hello,

I'm hitting the following bug on a 5.16 kernel. It happens periodically (a few times per day, on every one of our production servers running this kernel). I'm unable to reproduce it on machines with lower load, and I also know for sure that it doesn't happen on the 5.10 kernel.

I wonder if it's related to trace_tcp_retransmit_synack?

I'm happy to help, let me know.

Cheers,
Laurent

---

(gdb) l *(tcp_rtx_synack+0x8d)
0xffffffff817ee76d is in tcp_rtx_synack (arch/x86/include/asm/preempt.h:95).
90       * a decrement which hits zero means we have no preempt_count and should
91       * reschedule.
92       */
93      static __always_inline bool __preempt_count_dec_and_test(void)
94      {
95              return GEN_UNARY_RMWcc("decl", __preempt_count, e, __percpu_arg([var]));
96      }
97
98      /*
99       * Returns true when we need to resched and can (barring IRQ state).
(gdb) l *(tcp_rtx_synack+0x8d-4)
0xffffffff817ee769 is in tcp_rtx_synack (include/trace/events/tcp.h:190).
185             TP_PROTO(struct sock *sk),
186
187             TP_ARGS(sk)
188     );
189
190     TRACE_EVENT(tcp_retransmit_synack,
191
192             TP_PROTO(const struct sock *sk, const struct request_sock *req),
193
194             TP_ARGS(sk, req),

--

BUG: using __this_cpu_add() in preemptible [00000000] code: epollpep/2180
caller is tcp_rtx_synack.part.0+0x36/0xc0
CPU: 10 PID: 2180 Comm: epollpep Tainted: G           OE     5.16.0-0.bpo.4-amd64 #1  Debian 5.16.12-1~bpo11+1
Hardware name: Supermicro SYS-5039MC-H8TRF/X11SCD-F, BIOS 1.7 11/23/2021
Call Trace:
 <TASK>
 dump_stack_lvl+0x48/0x5e
 check_preemption_disabled+0xde/0xe0
 tcp_rtx_synack.part.0+0x36/0xc0
 tcp_rtx_synack+0x8d/0xa0
 ? kmem_cache_alloc+0x2e0/0x3e0
 ? apparmor_file_alloc_security+0x3b/0x1f0
 inet_rtx_syn_ack+0x16/0x30
 tcp_check_req+0x367/0x610
 tcp_rcv_state_process+0x91/0xf60
 ? get_nohz_timer_target+0x18/0x1a0
 ? lock_timer_base+0x61/0x80
 ? preempt_count_add+0x68/0xa0
 tcp_v4_do_rcv+0xbd/0x270
 __release_sock+0x6d/0xb0
 release_sock+0x2b/0x90
 sock_setsockopt+0x138/0x1140
 ? __sys_getsockname+0x7e/0xc0
 ? aa_sk_perm+0x3e/0x1a0
 __sys_setsockopt+0x198/0x1e0
 __x64_sys_setsockopt+0x21/0x30
 do_syscall_64+0x38/0xc0
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fefe7d4441a
Code: ff ff ff c3 0f 1f 40 00 48 8b 15 71 ea 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 49 89 ca b8 36 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 46 ea 0b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffca1cd0ab8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fefe7d4441a
RDX: 0000000000000009 RSI: 0000000000000001 RDI: 00000000000006f3
RBP: 00007ffca1cd1410 R08: 0000000000000004 R09: 0000560e9f8a55ec
R10: 00007ffca1cd10f0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffca1cd1190 R14: 00007ffca1cd1198 R15: 00007ffca1cd23f0
 </TASK>



* Re: Bug in tcp_rtx_synack?
  2022-05-30 10:08 Bug in tcp_rtx_synack? Laurent Fasnacht
@ 2022-05-30 18:09 ` Eric Dumazet
  0 siblings, 0 replies; 2+ messages in thread
From: Eric Dumazet @ 2022-05-30 18:09 UTC (permalink / raw)
  To: Laurent Fasnacht, netdev@vger.kernel.org, Eric Dumazet,
	Yuchung Cheng, Neal Cardwell


On 5/30/22 03:08, Laurent Fasnacht wrote:
> Hello,
>
> I'm hitting the following bug on a 5.16 kernel. It happens periodically (a few times per day, on every one of our production servers running this kernel). I'm unable to reproduce it on machines with lower load, and I also know for sure that it doesn't happen on the 5.10 kernel.
>
> I wonder if it's related to trace_tcp_retransmit_synack?
>
> I'm happy to help, let me know.
>
> Cheers,
> Laurent
>
> [gdb listing and stack trace snipped; see the full report above]
>

I think this is related to the FastOpen server code.

It seems that we retransmit a SYNACK if we receive the SYN a second time.

This means tcp_rtx_synack() can be called from process context (instead
of from a BH handler) in this particular case.

Fix could be:

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b4b2284ed4a2c9e2569bd945e3b4e023c5502f25..1c054431e358328fe3849f5a45aaa88308a1e1c8 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -4115,8 +4115,8 @@ int tcp_rtx_synack(const struct sock *sk, struct request_sock *req)
 	res = af_ops->send_synack(sk, NULL, &fl, req, NULL, TCP_SYNACK_NORMAL,
 				  NULL);
 	if (!res) {
-		__TCP_INC_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS);
-		__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
+		TCP_INC_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS);
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
 		if (unlikely(tcp_passive_fastopen(sk)))
 			tcp_sk(sk)->total_retrans++;
 		trace_tcp_retransmit_synack(sk, req);



