* [PATCH 1/2] tcp: fix races in tcp_abort()
From: Youngmin Nam @ 2025-03-14 9:24 UTC
To: stable
Cc: ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
youngmin.nam, cmllamas, willdeacon, maennich, gregkh
From: Eric Dumazet <edumazet@google.com>
tcp_abort() has the same issue as the one fixed in the prior patch
in tcp_write_err().
commit 5ce4645c23cf5f048eb8e9ce49e514bababdee85 upstream.
To apply commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4,
this patch must be applied first.
In order to get consistent results from tcp_poll(), we must call
sk_error_report() after tcp_done().
We can use tcp_done_with_error() to centralize this logic.
Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cc: <stable@vger.kernel.org> # v5.10+
[youngmin: Resolved minor conflict in net/ipv4/tcp.c]
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
---
net/ipv4/tcp.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7ad82be40f34..9fe164aa185c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4630,13 +4630,9 @@ int tcp_abort(struct sock *sk, int err)
bh_lock_sock(sk);
if (!sock_flag(sk, SOCK_DEAD)) {
- WRITE_ONCE(sk->sk_err, err);
- /* This barrier is coupled with smp_rmb() in tcp_poll() */
- smp_wmb();
- sk_error_report(sk);
if (tcp_need_reset(sk->sk_state))
tcp_send_active_reset(sk, GFP_ATOMIC);
- tcp_done(sk);
+ tcp_done_with_error(sk, err);
}
bh_unlock_sock(sk);
--
2.39.2
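The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `toy_sock` and every `toy_*` function are hypothetical names, and the "poller" is simulated by having the error report take a snapshot of the socket at the moment it runs. It assumes, following the commit message, that the centralized helper records the error, finishes the socket, and only then reports.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_ESTABLISHED, TOY_CLOSE };

struct toy_sock {
	enum toy_state state;
	int err;
	/* What a poller observed when the error report woke it up. */
	bool woken;
	int seen_err;
	enum toy_state seen_state;
};

/* Stand-in for the error report: the woken poller snapshots the socket. */
static void toy_error_report(struct toy_sock *sk)
{
	sk->woken = true;
	sk->seen_err = sk->err;
	sk->seen_state = sk->state;
}

/* Stand-in for finishing the socket: it ends up closed. */
static void toy_done(struct toy_sock *sk)
{
	sk->state = TOY_CLOSE;
}

/* Old ordering: the report happens before the socket is closed. */
static void toy_abort_old(struct toy_sock *sk, int err)
{
	assert(err != 0);
	sk->err = err;
	toy_error_report(sk);
	toy_done(sk);
}

/* New ordering: close first, report afterwards. */
static void toy_done_with_error(struct toy_sock *sk, int err)
{
	assert(err != 0);
	sk->err = err;
	toy_done(sk);
	toy_error_report(sk);
}

/* True if the poller saw both the error and the closed state. */
static bool toy_poll_consistent(const struct toy_sock *sk)
{
	return sk->woken && sk->seen_err != 0 && sk->seen_state == TOY_CLOSE;
}
```

With the old ordering the simulated poller sees the error while the socket still looks established. With the new ordering it sees the error and the closed state together, which is the consistency the patch is after.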
* [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort
From: Youngmin Nam @ 2025-03-14 9:24 UTC
To: stable
Cc: ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
youngmin.nam, cmllamas, willdeacon, maennich, gregkh,
Lorenzo Colitti, Jason Xing
From: Xueming Feng <kuro@kuroa.me>
commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 upstream.
We had trouble closing zero-window FIN-WAIT-1 TCP sockets in our
environment. This patch comes from the investigation.
Previously, tcp_abort() only sent a reset and called tcp_done() when the
socket was not SOCK_DEAD, i.e. not an orphan. For an orphan socket, it
only purged the write queue; it did not close the socket and left that
to the timers.
While purging the write queue, tp->packets_out and sk->sk_write_queue
are cleared along the way. However, tcp_retransmit_timer() returns early
based on !tp->packets_out, and tcp_probe_timer() returns early based on
!sk->sk_write_queue.
As a result, ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 were not rescheduled
and the socket was never killed by the timers, turning a zero-window
orphan into a forever orphan.
This patch removes the SOCK_DEAD check in tcp_abort(), making it send a
reset to the peer and close the socket accordingly, which prevents the
timer-less orphan from happening.
According to Lorenzo's email in the v1 thread, the check was there to
prevent force-closing the same socket twice. That situation is now
handled by testing for TCP_CLOSE under the socket lock and returning
-ENOENT if the socket is already closed.
The -ENOENT code comes from the associated patch Lorenzo made for
iproute2-ss (link attached below), which also conforms to RFC 9293.
At the end of the patch, tcp_write_queue_purge(sk) is removed because it
is already called in tcp_done_with_error().
P.S. This is the same patch as v2, resent because it was mislabeled
"changes requested" on patchwork.kernel.org.
Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/
Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Xueming Feng <kuro@kuroa.me>
Tested-by: Lorenzo Colitti <lorenzo@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cc: <stable@vger.kernel.org> # v5.10+
Link: https://lore.kernel.org/lkml/Z9OZS%2Fhc+v5og6%2FU@perf/
[youngmin: Resolved minor conflict in net/ipv4/tcp.c]
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
---
net/ipv4/tcp.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9fe164aa185c..ff22060f9145 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4620,6 +4620,13 @@ int tcp_abort(struct sock *sk, int err)
/* Don't race with userspace socket closes such as tcp_close. */
lock_sock(sk);
+ /* Avoid closing the same socket twice. */
+ if (sk->sk_state == TCP_CLOSE) {
+ if (!has_current_bpf_ctx())
+ release_sock(sk);
+ return -ENOENT;
+ }
+
if (sk->sk_state == TCP_LISTEN) {
tcp_set_state(sk, TCP_CLOSE);
inet_csk_listen_stop(sk);
@@ -4629,15 +4636,12 @@ int tcp_abort(struct sock *sk, int err)
local_bh_disable();
bh_lock_sock(sk);
- if (!sock_flag(sk, SOCK_DEAD)) {
- if (tcp_need_reset(sk->sk_state))
- tcp_send_active_reset(sk, GFP_ATOMIC);
- tcp_done_with_error(sk, err);
- }
+ if (tcp_need_reset(sk->sk_state))
+ tcp_send_active_reset(sk, GFP_ATOMIC);
+ tcp_done_with_error(sk, err);
bh_unlock_sock(sk);
local_bh_enable();
- tcp_write_queue_purge(sk);
if (!has_current_bpf_ctx())
release_sock(sk);
return 0;
--
2.39.2
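The failure mode and the fix described in the commit message can be sketched as a user-space model. This is a simplified illustration, not the kernel implementation: `toy_orphan` and all `toy_*` functions are hypothetical, a single timer stands in for both the retransmit and probe timers, and the model has only two states, so the reset is always sent for a socket that is not yet closed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_state { TOY_FIN_WAIT1, TOY_CLOSE };

struct toy_orphan {
	enum toy_state state;
	bool dead;        /* models SOCK_DEAD: the socket is an orphan */
	int packets_out;  /* models tp->packets_out */
	int write_queue;  /* models the length of sk->sk_write_queue */
	bool timer_armed; /* will a timer fire again for this socket? */
	int resets_sent;
};

static void toy_purge(struct toy_orphan *sk)
{
	sk->packets_out = 0;
	sk->write_queue = 0;
}

static void toy_done(struct toy_orphan *sk)
{
	toy_purge(sk);
	sk->state = TOY_CLOSE;
	sk->timer_armed = false;
}

/* Timer handler: returns early, without re-arming, when nothing is queued. */
static void toy_timer_fire(struct toy_orphan *sk)
{
	sk->timer_armed = false;        /* the pending timer has fired */
	if (sk->state == TOY_CLOSE)
		return;
	if (!sk->packets_out || !sk->write_queue)
		return;                 /* early return: never rescheduled */
	sk->timer_armed = true;         /* retry later */
}

/* Old behaviour: orphans are only purged, never closed. */
static int toy_abort_old(struct toy_orphan *sk)
{
	assert(sk->packets_out >= 0);
	if (!sk->dead) {
		sk->resets_sent++;
		toy_done(sk);
	}
	toy_purge(sk);
	return 0;
}

/* New behaviour: reject an already closed socket, otherwise always close. */
static int toy_abort_new(struct toy_orphan *sk)
{
	if (sk->state == TOY_CLOSE)
		return -ENOENT;
	sk->resets_sent++;
	toy_done(sk);
	return 0;
}

/* A "forever orphan": still open, but no timer left to kill it. */
static bool toy_stuck(const struct toy_orphan *sk)
{
	return sk->state != TOY_CLOSE && !sk->timer_armed;
}
```

In this model, aborting an orphan the old way leaves it open with an empty queue, so the next timer run returns early and nothing ever closes it. The new path closes the socket directly, and a second abort on the same socket returns -ENOENT instead of acting twice.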
* Re: [PATCH 1/2] tcp: fix races in tcp_abort()
From: Greg KH @ 2025-03-14 12:24 UTC
To: Youngmin Nam
Cc: stable, ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
cmllamas, willdeacon, maennich, gregkh
On Fri, Mar 14, 2025 at 06:24:45PM +0900, Youngmin Nam wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> tcp_abort() has the same issue than the one fixed in the prior patch
> in tcp_write_err().
>
> commit 5ce4645c23cf5f048eb8e9ce49e514bababdee85 upstream.
>
> To apply commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4,
> this patch must be applied first.
>
> In order to get consistent results from tcp_poll(), we must call
> sk_error_report() after tcp_done().
>
> We can use tcp_done_with_error() to centralize this logic.
>
> Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Neal Cardwell <ncardwell@google.com>
> Link: https://lore.kernel.org/r/20240528125253.1966136-4-edumazet@google.com
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Cc: <stable@vger.kernel.org> # v5.10+
Did not apply to 5.10.y, what did you want this added to?
thanks,
greg k-h
* Re: [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort
From: Greg KH @ 2025-03-14 12:24 UTC
To: Youngmin Nam
Cc: stable, ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
cmllamas, willdeacon, maennich, gregkh, Lorenzo Colitti,
Jason Xing
On Fri, Mar 14, 2025 at 06:24:46PM +0900, Youngmin Nam wrote:
> From: Xueming Feng <kuro@kuroa.me>
>
> commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 upstream.
>
> We have some problem closing zero-window fin-wait-1 tcp sockets in our
> environment. This patch come from the investigation.
>
> Previously tcp_abort only sends out reset and calls tcp_done when the
> socket is not SOCK_DEAD, aka orphan. For orphan socket, it will only
> purging the write queue, but not close the socket and left it to the
> timer.
>
> While purging the write queue, tp->packets_out and sk->sk_write_queue
> is cleared along the way. However tcp_retransmit_timer have early
> return based on !tp->packets_out and tcp_probe_timer have early
> return based on !sk->sk_write_queue.
>
> This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched
> and socket not being killed by the timers, converting a zero-windowed
> orphan into a forever orphan.
>
> This patch removes the SOCK_DEAD check in tcp_abort, making it send
> reset to peer and close the socket accordingly. Preventing the
> timer-less orphan from happening.
>
> According to Lorenzo's email in the v1 thread, the check was there to
> prevent force-closing the same socket twice. That situation is handled
> by testing for TCP_CLOSE inside lock, and returning -ENOENT if it is
> already closed.
>
> The -ENOENT code comes from the associate patch Lorenzo made for
> iproute2-ss; link attached below, which also conform to RFC 9293.
>
> At the end of the patch, tcp_write_queue_purge(sk) is removed because it
> was already called in tcp_done_with_error().
>
> p.s. This is the same patch with v2. Resent due to mis-labeled "changes
> requested" on patchwork.kernel.org.
>
> Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/
> Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
> Signed-off-by: Xueming Feng <kuro@kuroa.me>
> Tested-by: Lorenzo Colitti <lorenzo@google.com>
> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Cc: <stable@vger.kernel.org> # v5.10+
Does not apply to 6.1.y or older, what did you want this applied to?
thanks,
greg k-h
* Re: [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort
From: Youngmin Nam @ 2025-03-17 4:32 UTC
To: Greg KH
Cc: stable, ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
cmllamas, willdeacon, maennich, gregkh, Lorenzo Colitti,
Jason Xing, Youngmin Nam
On Fri, Mar 14, 2025 at 01:24:26PM +0100, Greg KH wrote:
> On Fri, Mar 14, 2025 at 06:24:46PM +0900, Youngmin Nam wrote:
> > From: Xueming Feng <kuro@kuroa.me>
> >
> > commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 upstream.
> >
> > We have some problem closing zero-window fin-wait-1 tcp sockets in our
> > environment. This patch come from the investigation.
> >
> > Previously tcp_abort only sends out reset and calls tcp_done when the
> > socket is not SOCK_DEAD, aka orphan. For orphan socket, it will only
> > purging the write queue, but not close the socket and left it to the
> > timer.
> >
> > While purging the write queue, tp->packets_out and sk->sk_write_queue
> > is cleared along the way. However tcp_retransmit_timer have early
> > return based on !tp->packets_out and tcp_probe_timer have early
> > return based on !sk->sk_write_queue.
> >
> > This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched
> > and socket not being killed by the timers, converting a zero-windowed
> > orphan into a forever orphan.
> >
> > This patch removes the SOCK_DEAD check in tcp_abort, making it send
> > reset to peer and close the socket accordingly. Preventing the
> > timer-less orphan from happening.
> >
> > According to Lorenzo's email in the v1 thread, the check was there to
> > prevent force-closing the same socket twice. That situation is handled
> > by testing for TCP_CLOSE inside lock, and returning -ENOENT if it is
> > already closed.
> >
> > The -ENOENT code comes from the associate patch Lorenzo made for
> > iproute2-ss; link attached below, which also conform to RFC 9293.
> >
> > At the end of the patch, tcp_write_queue_purge(sk) is removed because it
> > was already called in tcp_done_with_error().
> >
> > p.s. This is the same patch with v2. Resent due to mis-labeled "changes
> > requested" on patchwork.kernel.org.
> >
> > > Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/
> > Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
> > Signed-off-by: Xueming Feng <kuro@kuroa.me>
> > Tested-by: Lorenzo Colitti <lorenzo@google.com>
> > Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
> > Reviewed-by: Eric Dumazet <edumazet@google.com>
> > > Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > Cc: <stable@vger.kernel.org> # v5.10+
>
> Does not apply to 6.1.y or older, what did you want this applied to?
>
> thanks,
>
> greg k-h
>
Hi Greg,
Sorry about that. Let me resend these patches for 6.1 and 5.15.
As for 5.10, it seems to have more dependencies for the backport.
I think the maintainer should handle it to ensure a safe backport.
* Re: [PATCH 1/2] tcp: fix races in tcp_abort()
From: Youngmin Nam @ 2025-03-17 4:36 UTC
To: Greg KH
Cc: stable, ncardwell, edumazet, kuba, davem, dsahern, pabeni, horms,
guo88.liu, yiwang.cai, netdev, linux-kernel, joonki.min,
hajun.sung, d7271.choe, sw.ju, dujeong.lee, ycheng, yyd, kuro,
cmllamas, willdeacon, maennich, gregkh, Lorenzo Colitti,
Jason Xing, Youngmin Nam
On Fri, Mar 14, 2025 at 01:24:09PM +0100, Greg KH wrote:
> On Fri, Mar 14, 2025 at 06:24:45PM +0900, Youngmin Nam wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > tcp_abort() has the same issue than the one fixed in the prior patch
> > in tcp_write_err().
> >
> > commit 5ce4645c23cf5f048eb8e9ce49e514bababdee85 upstream.
> >
> > To apply commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4,
> > this patch must be applied first.
> >
> > In order to get consistent results from tcp_poll(), we must call
> > sk_error_report() after tcp_done().
> >
> > We can use tcp_done_with_error() to centralize this logic.
> >
> > Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Acked-by: Neal Cardwell <ncardwell@google.com>
> > Link: https://lore.kernel.org/r/20240528125253.1966136-4-edumazet@google.com
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > Cc: <stable@vger.kernel.org> # v5.10+
>
> Did not apply to 5.10.y, what did you want this added to?
>
> thanks,
>
> greg k-h
>
Hi Greg,
Sorry about that.
As for 5.10, it seems to have more dependencies for the backport.
I think the maintainer should handle it to ensure a safe backport.
Thread overview: 6+ messages
[not found] <CGME20250314092125epcas2p418cd0caeffc32b05fba4fdd2e4ffb9fa@epcas2p4.samsung.com>
2025-03-14 9:24 ` [PATCH 1/2] tcp: fix races in tcp_abort() Youngmin Nam
[not found] ` <CGME20250314092130epcas2p34e60b23ff983fe03195820a38fb376c5@epcas2p3.samsung.com>
2025-03-14 9:24 ` [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort Youngmin Nam
2025-03-14 12:24 ` Greg KH
2025-03-17 4:32 ` Youngmin Nam
2025-03-14 12:24 ` [PATCH 1/2] tcp: fix races in tcp_abort() Greg KH
2025-03-17 4:36 ` Youngmin Nam