* [PATCH 1/2] tcp: fix races in tcp_abort()
From: Youngmin Nam @ 2025-03-14 9:24 UTC
  To: stable
  Cc: ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
	guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
	hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
	youngmin.nam, cmllamas, willdeacon, maennich, gregkh

From: Eric Dumazet <edumazet@google.com>

tcp_abort() has the same issue as the one fixed in the prior patch
in tcp_write_err().

commit 5ce4645c23cf5f048eb8e9ce49e514bababdee85 upstream.

To apply commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4,
this patch must be applied first.

In order to get consistent results from tcp_poll(), we must call
sk_error_report() after tcp_done().

We can use tcp_done_with_error() to centralize this logic.

Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cc: <stable@vger.kernel.org> # v5.10+
[youngmin: Resolved minor conflict in net/ipv4/tcp.c]
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
---
 net/ipv4/tcp.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7ad82be40f34..9fe164aa185c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4630,13 +4630,9 @@ int tcp_abort(struct sock *sk, int err)
 	bh_lock_sock(sk);
 
 	if (!sock_flag(sk, SOCK_DEAD)) {
-		WRITE_ONCE(sk->sk_err, err);
-		/* This barrier is coupled with smp_rmb() in tcp_poll() */
-		smp_wmb();
-		sk_error_report(sk);
 		if (tcp_need_reset(sk->sk_state))
 			tcp_send_active_reset(sk, GFP_ATOMIC);
-		tcp_done(sk);
+		tcp_done_with_error(sk, err);
 	}
 
 	bh_unlock_sock(sk);
-- 
2.39.2
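The ordering argument in the patch above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct toy_sock and every toy_* function are hypothetical stand-ins, and the model ignores locking and memory barriers. It records which state a waiter would observe at the moment the error is reported, first with the old order (report, then close) and then with the order the commit message asks for (close, then report).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-state model; the real TCP state machine is larger. */
enum toy_state { TOY_ESTABLISHED, TOY_CLOSE };

struct toy_sock {
	enum toy_state state;
	int err;                         /* stands in for sk->sk_err */
	int write_queue_len;             /* stands in for the write queue */
	enum toy_state state_at_report;  /* what a woken waiter would see */
	bool reported;
};

/* Stand-in for the error report: snapshot the state a waiter observes. */
static void toy_error_report(struct toy_sock *sk)
{
	sk->state_at_report = sk->state;
	sk->reported = true;
}

/* Stand-in for closing the socket. */
static void toy_done(struct toy_sock *sk)
{
	sk->state = TOY_CLOSE;
}

/* Old order in the removed lines: set error, report, then close. */
static void toy_abort_old_order(struct toy_sock *sk, int err)
{
	sk->err = err;
	toy_error_report(sk);
	toy_done(sk);
}

/* Order described in the commit message: set error, purge, close, report. */
static void toy_done_with_error(struct toy_sock *sk, int err)
{
	sk->err = err;
	sk->write_queue_len = 0;
	toy_done(sk);
	toy_error_report(sk);
}

/* Small self-check that exercises both orders. */
void toy_ordering_demo(void)
{
	struct toy_sock a = { .state = TOY_ESTABLISHED };
	struct toy_sock b = { .state = TOY_ESTABLISHED, .write_queue_len = 3 };

	toy_abort_old_order(&a, 103);
	assert(a.state_at_report == TOY_ESTABLISHED); /* waiter sees stale state */

	toy_done_with_error(&b, 103);
	assert(b.state_at_report == TOY_CLOSE);       /* waiter sees final state */
}
```

In the model, a waiter woken by the old order still sees the socket as established, while the new order guarantees that the error and the closed state are both visible by the time the report happens.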
* [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort
From: Youngmin Nam @ 2025-03-14 9:24 UTC
  To: stable
  Cc: ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
	guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
	hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
	youngmin.nam, cmllamas, willdeacon, maennich, gregkh,
	Lorenzo Colitti, Jason Xing

From: Xueming Feng <kuro@kuroa.me>

commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 upstream.

We have some problem closing zero-window fin-wait-1 tcp sockets in our
environment. This patch comes from the investigation.

Previously tcp_abort only sends out a reset and calls tcp_done when the
socket is not SOCK_DEAD, aka orphan. For an orphan socket, it will only
purge the write queue, but not close the socket, and leave it to the
timer.

While purging the write queue, tp->packets_out and sk->sk_write_queue
are cleared along the way. However, tcp_retransmit_timer has an early
return based on !tp->packets_out and tcp_probe_timer has an early
return based on !sk->sk_write_queue.

This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being rescheduled
and the socket not being killed by the timers, converting a zero-windowed
orphan into a forever orphan.

This patch removes the SOCK_DEAD check in tcp_abort, making it send a
reset to the peer and close the socket accordingly, preventing the
timer-less orphan from happening.

According to Lorenzo's email in the v1 thread, the check was there to
prevent force-closing the same socket twice. That situation is handled
by testing for TCP_CLOSE inside the lock, and returning -ENOENT if it is
already closed.

The -ENOENT code comes from the associated patch Lorenzo made for
iproute2-ss (link attached below), which also conforms to RFC 9293.

At the end of the patch, tcp_write_queue_purge(sk) is removed because it
is already called in tcp_done_with_error().

p.s. This is the same patch as v2. Resent due to a mis-labeled "changes
requested" on patchwork.kernel.org.

Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/
Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Xueming Feng <kuro@kuroa.me>
Tested-by: Lorenzo Colitti <lorenzo@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cc: <stable@vger.kernel.org> # v5.10+
Link: https://lore.kernel.org/lkml/Z9OZS%2Fhc+v5og6%2FU@perf/
[youngmin: Resolved minor conflict in net/ipv4/tcp.c]
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
---
 net/ipv4/tcp.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9fe164aa185c..ff22060f9145 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4620,6 +4620,13 @@ int tcp_abort(struct sock *sk, int err)
 		/* Don't race with userspace socket closes such as tcp_close. */
 		lock_sock(sk);
 
+	/* Avoid closing the same socket twice. */
+	if (sk->sk_state == TCP_CLOSE) {
+		if (!has_current_bpf_ctx())
+			release_sock(sk);
+		return -ENOENT;
+	}
+
 	if (sk->sk_state == TCP_LISTEN) {
 		tcp_set_state(sk, TCP_CLOSE);
 		inet_csk_listen_stop(sk);
@@ -4629,15 +4636,12 @@ int tcp_abort(struct sock *sk, int err)
 	local_bh_disable();
 	bh_lock_sock(sk);
 
-	if (!sock_flag(sk, SOCK_DEAD)) {
-		if (tcp_need_reset(sk->sk_state))
-			tcp_send_active_reset(sk, GFP_ATOMIC);
-		tcp_done_with_error(sk, err);
-	}
+	if (tcp_need_reset(sk->sk_state))
+		tcp_send_active_reset(sk, GFP_ATOMIC);
+	tcp_done_with_error(sk, err);
 
 	bh_unlock_sock(sk);
 	local_bh_enable();
-	tcp_write_queue_purge(sk);
 	if (!has_current_bpf_ctx())
 		release_sock(sk);
 	return 0;
-- 
2.39.2
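The failure mode described in the commit message above can be reproduced in a toy userspace model. The sketch below is an illustration under assumptions, not the kernel implementation: struct toy_sock and all toy_* names are hypothetical, the two timers are reduced to the early-return conditions quoted in the commit message, and the reset is always sent because the only open state modelled is FIN-WAIT-1.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical two-state model: one open state and the closed state. */
enum toy_state { TOY_FIN_WAIT1, TOY_CLOSE };

struct toy_sock {
	enum toy_state state;
	bool dead;            /* models SOCK_DEAD: the socket is an orphan */
	int packets_out;      /* models tp->packets_out */
	int write_queue_len;  /* models the length of sk->sk_write_queue */
	int resets_sent;
};

/* Models purging the write queue: both counters end up at zero. */
static void toy_write_queue_purge(struct toy_sock *sk)
{
	sk->packets_out = 0;
	sk->write_queue_len = 0;
}

/*
 * Models the early returns of the two timers. Returns true if at least
 * one timer would still act on the socket and could eventually kill it.
 */
static bool toy_timers_still_active(const struct toy_sock *sk)
{
	bool retrans = sk->packets_out != 0;    /* retransmit timer path */
	bool probe = sk->write_queue_len != 0;  /* zero-window probe path */

	return sk->state != TOY_CLOSE && (retrans || probe);
}

/* Behaviour before the patch: an orphan is purged but never closed. */
static int toy_abort_old(struct toy_sock *sk)
{
	if (!sk->dead) {
		sk->resets_sent++;
		sk->state = TOY_CLOSE;
	}
	toy_write_queue_purge(sk);
	return 0;
}

/* Behaviour after the patch: guard on the closed state, then always close. */
static int toy_abort_new(struct toy_sock *sk)
{
	if (sk->state == TOY_CLOSE)
		return -ENOENT;      /* avoid closing the same socket twice */
	sk->resets_sent++;
	toy_write_queue_purge(sk);   /* the close path purges on its own */
	sk->state = TOY_CLOSE;
	return 0;
}

/* Small self-check comparing both behaviours on an orphan socket. */
void toy_orphan_demo(void)
{
	struct toy_sock o = { .state = TOY_FIN_WAIT1, .dead = true,
			      .packets_out = 2, .write_queue_len = 2 };
	struct toy_sock n = o;

	assert(toy_abort_old(&o) == 0);
	assert(o.state == TOY_FIN_WAIT1 && !toy_timers_still_active(&o));

	assert(toy_abort_new(&n) == 0);
	assert(n.state == TOY_CLOSE);
	assert(toy_abort_new(&n) == -ENOENT);
}
```

With the old behaviour the orphan stays in its open state while no timer has anything left to act on, which is the "forever orphan". With the new behaviour the first abort closes the socket and a second abort is rejected with -ENOENT.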
* Re: [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort
From: Greg KH @ 2025-03-14 12:24 UTC
  To: Youngmin Nam
  Cc: stable, ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
	guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
	hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
	cmllamas, willdeacon, maennich, gregkh, Lorenzo Colitti,
	Jason Xing

On Fri, Mar 14, 2025 at 06:24:46PM +0900, Youngmin Nam wrote:
> From: Xueming Feng <kuro@kuroa.me>
>
> commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 upstream.
>
> We have some problem closing zero-window fin-wait-1 tcp sockets in our
> environment. This patch comes from the investigation.
>
> Previously tcp_abort only sends out a reset and calls tcp_done when the
> socket is not SOCK_DEAD, aka orphan. For an orphan socket, it will only
> purge the write queue, but not close the socket, and leave it to the
> timer.
>
> While purging the write queue, tp->packets_out and sk->sk_write_queue
> are cleared along the way. However, tcp_retransmit_timer has an early
> return based on !tp->packets_out and tcp_probe_timer has an early
> return based on !sk->sk_write_queue.
>
> This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being rescheduled
> and the socket not being killed by the timers, converting a zero-windowed
> orphan into a forever orphan.
>
> This patch removes the SOCK_DEAD check in tcp_abort, making it send a
> reset to the peer and close the socket accordingly, preventing the
> timer-less orphan from happening.
>
> According to Lorenzo's email in the v1 thread, the check was there to
> prevent force-closing the same socket twice. That situation is handled
> by testing for TCP_CLOSE inside the lock, and returning -ENOENT if it is
> already closed.
>
> The -ENOENT code comes from the associated patch Lorenzo made for
> iproute2-ss (link attached below), which also conforms to RFC 9293.
>
> At the end of the patch, tcp_write_queue_purge(sk) is removed because it
> is already called in tcp_done_with_error().
>
> p.s. This is the same patch as v2. Resent due to a mis-labeled "changes
> requested" on patchwork.kernel.org.
>
> Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/
> Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
> Signed-off-by: Xueming Feng <kuro@kuroa.me>
> Tested-by: Lorenzo Colitti <lorenzo@google.com>
> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Cc: <stable@vger.kernel.org> # v5.10+

Does not apply to 6.1.y or older, what did you want this applied to?

thanks,

greg k-h
* Re: [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort
From: Youngmin Nam @ 2025-03-17 4:32 UTC
  To: Greg KH
  Cc: stable, ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
	guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
	hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
	cmllamas, willdeacon, maennich, gregkh, Lorenzo Colitti,
	Jason Xing, Youngmin Nam

On Fri, Mar 14, 2025 at 01:24:26PM +0100, Greg KH wrote:
> On Fri, Mar 14, 2025 at 06:24:46PM +0900, Youngmin Nam wrote:
> > From: Xueming Feng <kuro@kuroa.me>
> >
> > commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 upstream.
> >
> > We have some problem closing zero-window fin-wait-1 tcp sockets in our
> > environment. This patch comes from the investigation.
> >
> > Previously tcp_abort only sends out a reset and calls tcp_done when the
> > socket is not SOCK_DEAD, aka orphan. For an orphan socket, it will only
> > purge the write queue, but not close the socket, and leave it to the
> > timer.
> >
> > While purging the write queue, tp->packets_out and sk->sk_write_queue
> > are cleared along the way. However, tcp_retransmit_timer has an early
> > return based on !tp->packets_out and tcp_probe_timer has an early
> > return based on !sk->sk_write_queue.
> >
> > This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being rescheduled
> > and the socket not being killed by the timers, converting a zero-windowed
> > orphan into a forever orphan.
> >
> > This patch removes the SOCK_DEAD check in tcp_abort, making it send a
> > reset to the peer and close the socket accordingly, preventing the
> > timer-less orphan from happening.
> >
> > According to Lorenzo's email in the v1 thread, the check was there to
> > prevent force-closing the same socket twice. That situation is handled
> > by testing for TCP_CLOSE inside the lock, and returning -ENOENT if it is
> > already closed.
> >
> > The -ENOENT code comes from the associated patch Lorenzo made for
> > iproute2-ss (link attached below), which also conforms to RFC 9293.
> >
> > At the end of the patch, tcp_write_queue_purge(sk) is removed because it
> > is already called in tcp_done_with_error().
> >
> > p.s. This is the same patch as v2. Resent due to a mis-labeled "changes
> > requested" on patchwork.kernel.org.
> >
> > Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/
> > Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
> > Signed-off-by: Xueming Feng <kuro@kuroa.me>
> > Tested-by: Lorenzo Colitti <lorenzo@google.com>
> > Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
> > Reviewed-by: Eric Dumazet <edumazet@google.com>
> > Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > Cc: <stable@vger.kernel.org> # v5.10+
>
> Does not apply to 6.1.y or older, what did you want this applied to?
>
> thanks,
>
> greg k-h
>

Hi Greg,

Sorry about that. Let me resend these patches for 6.1 and 5.15.

As for 5.10, it seems to have more dependencies for the backport.
I think the maintainer should handle it to ensure a safe backport.
* Re: [PATCH 1/2] tcp: fix races in tcp_abort()
From: Greg KH @ 2025-03-14 12:24 UTC
  To: Youngmin Nam
  Cc: stable, ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
	guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
	hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
	cmllamas, willdeacon, maennich, gregkh

On Fri, Mar 14, 2025 at 06:24:45PM +0900, Youngmin Nam wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> tcp_abort() has the same issue as the one fixed in the prior patch
> in tcp_write_err().
>
> commit 5ce4645c23cf5f048eb8e9ce49e514bababdee85 upstream.
>
> To apply commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4,
> this patch must be applied first.
>
> In order to get consistent results from tcp_poll(), we must call
> sk_error_report() after tcp_done().
>
> We can use tcp_done_with_error() to centralize this logic.
>
> Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Neal Cardwell <ncardwell@google.com>
> Link: https://lore.kernel.org/r/20240528125253.1966136-4-edumazet@google.com
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Cc: <stable@vger.kernel.org> # v5.10+

Did not apply to 5.10.y, what did you want this added to?

thanks,

greg k-h
* Re: [PATCH 1/2] tcp: fix races in tcp_abort()
From: Youngmin Nam @ 2025-03-17 4:36 UTC
  To: Greg KH
  Cc: stable, ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
	guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
	hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
	cmllamas, willdeacon, maennich, gregkh, Lorenzo Colitti,
	Jason Xing, Youngmin Nam

On Fri, Mar 14, 2025 at 01:24:09PM +0100, Greg KH wrote:
> On Fri, Mar 14, 2025 at 06:24:45PM +0900, Youngmin Nam wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > tcp_abort() has the same issue as the one fixed in the prior patch
> > in tcp_write_err().
> >
> > commit 5ce4645c23cf5f048eb8e9ce49e514bababdee85 upstream.
> >
> > To apply commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4,
> > this patch must be applied first.
> >
> > In order to get consistent results from tcp_poll(), we must call
> > sk_error_report() after tcp_done().
> >
> > We can use tcp_done_with_error() to centralize this logic.
> >
> > Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Acked-by: Neal Cardwell <ncardwell@google.com>
> > Link: https://lore.kernel.org/r/20240528125253.1966136-4-edumazet@google.com
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > Cc: <stable@vger.kernel.org> # v5.10+
>
> Did not apply to 5.10.y, what did you want this added to?
>
> thanks,
>
> greg k-h
>

Hi Greg,

Sorry about that.
As for 5.10, it seems to have more dependencies for the backport.
I think the maintainer should handle it to ensure a safe backport.