* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
[not found] <bug-18952-10286@https.bugzilla.kernel.org/>
@ 2010-09-22 9:02 ` Andrew Morton
2010-09-25 3:05 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2010-09-22 9:02 UTC
To: netdev; +Cc: bugzilla-daemon, bugme-daemon, yuri
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Wed, 22 Sep 2010 08:50:12 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=18952
>
> Summary: The mount of SYN retries is not equal to
> /proc/sys/net/ipv4/tcp_syn_retries
> Product: Networking
> Version: 2.5
> Kernel Version: 2.6.32.12, 2.6.32.15, 2.6.35.4
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: IPV4
> AssignedTo: shemminger@linux-foundation.org
> ReportedBy: yuri@itinteg.net
> Regression: No
>
>
> When setting a value in /proc/sys/net/ipv4/tcp_syn_retries, the actual number
> of syn retries is the number set in /proc/sys/net/ipv4/tcp_syn_retries minus
> 2.
>
> For example:
> When setting /proc/sys/net/ipv4/tcp_syn_retries to 5 the actual number of SYN
> retries is 3.
> When setting /proc/sys/net/ipv4/tcp_syn_retries to 7 the actual number of SYN
> retries is 5.
> However, when setting /proc/sys/net/ipv4/tcp_syn_retries to 2 the
> actual number of SYN retries is 2.
>
> Note: In kernel 2.6.31.9, the actual number of SYNs = tcp_syn_retries + 1
>
>
> Steps to reproduce:
> sudo iptables -I INPUT 1 -i lo -p tcp --dport 88 -j DROP
> sudo tcpdump -n -i lo -v tcp port 88
>
> from another console run:
> time wget -t 1 -O - --connect-timeout=300 http://0:88
>
> tcpdump output:
> sudo tcpdump -n -i lo -v tcp port 88
> 11:29:39.820058 IP (tos 0x0, ttl 64, id 14664, offset 0, flags [DF], proto TCP
> (6), length 60)
> 127.0.0.1.43730 > 127.0.0.1.kerberos: Flags [S], cksum 0xfe30 (incorrect ->
> 0xecf4), seq 1012617667, win 32792, options [mss 16396,sackOK,TS val 12871819
> ecr 0,nop,wscale 7], length 0
> 11:29:42.824091 IP (tos 0x0, ttl 64, id 14665, offset 0, flags [DF], proto TCP
> (6), length 60)
> 127.0.0.1.43730 > 127.0.0.1.kerberos: Flags [S], cksum 0xfe30 (incorrect ->
> 0xe137), seq 1012617667, win 32792, options [mss 16396,sackOK,TS val 12874824
> ecr 0,nop,wscale 7], length 0
> 11:29:48.832153 IP (tos 0x0, ttl 64, id 14666, offset 0, flags [DF], proto TCP
> (6), length 60)
> 127.0.0.1.43730 > 127.0.0.1.kerberos: Flags [S], cksum 0xfe30 (incorrect ->
> 0xc9bf), seq 1012617667, win 32792, options [mss 16396,sackOK,TS val 12880832
> ecr 0,nop,wscale 7], length 0
>
> wget output:
> time wget -t 1 -O - --connect-timeout=300 http://0:88
> Resolving 0... 0.0.0.0
> Connecting to 0|0.0.0.0|:88... failed: Connection timed out.
> Giving up.
>
>
> real 0m21.050s
> user 0m0.003s
> sys 0m0.004s
>
> This is also valid for remote hosts.
>
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-22 9:02 ` [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries Andrew Morton
@ 2010-09-25 3:05 ` David Miller
2010-09-27 8:07 ` Yuri Chislov
0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2010-09-25 3:05 UTC
To: akpm; +Cc: netdev, bugzilla-daemon, bugme-daemon, yuri
tcp_syn_retries is not an exact calculation.
It is input into a calculation which estimates how long that many
retransmits (with suitable backoff applied) will take, and that time
estimate in turn determines the time limit for when we'll kill the
connection attempt.
Feel free to update the documentation in
Documentation/networking/ip-sysctl.txt to more closely match the
behavior.
The logic is in net/ipv4/tcp_timer.c:retransmits_timed_out().
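As a rough illustration of the calculation described above, the sketch below models how a retry count could be turned into a time limit. It is a simplified userspace model, not the kernel code: the function name `timeout_limit_ms` is hypothetical, and the constant values (a minimum RTO of 200 ms and a maximum RTO of 120 s) are assumptions about the kernels discussed in this thread.

```python
# Assumed constants, in milliseconds (not taken from this thread).
RTO_MIN_MS = 200      # assumed value of TCP_RTO_MIN
RTO_MAX_MS = 120000   # assumed value of TCP_RTO_MAX


def timeout_limit_ms(boundary, rto_min=RTO_MIN_MS, rto_max=RTO_MAX_MS):
    """Estimate the time limit that corresponds to `boundary` retries.

    Model: the RTO starts at rto_min and doubles after every timeout
    until it reaches rto_max; beyond that, each retry adds rto_max.
    """
    # Integer log2 of (rto_max / rto_min): doublings before saturation.
    thresh = (rto_max // rto_min).bit_length() - 1
    if boundary <= thresh:
        # rto_min * (1 + 2 + ... + 2**boundary)
        return ((2 << boundary) - 1) * rto_min
    return ((2 << thresh) - 1) * rto_min + (boundary - thresh) * rto_max


for retries in (2, 5, 7):
    print(retries, timeout_limit_ms(retries))
```

Under these assumptions a setting of 5 maps to a limit of 12.6 s and a setting of 7 to 51 s, which is a time budget rather than a packet count.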
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-25 3:05 ` David Miller
@ 2010-09-27 8:07 ` Yuri Chislov
2010-09-27 8:10 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Yuri Chislov @ 2010-09-27 8:07 UTC
To: David Miller; +Cc: akpm, netdev, bugzilla-daemon, bugme-daemon
It looks like the behavior changed in 2.6.32. Kernels 2.6.32 and up use a
calculation instead of a direct retry count, which makes it harder to
achieve the necessary system behavior.
The default behavior of the system changed completely
(the old default connect timeout was ~180 seconds, while the new one is ~21
seconds).
The new behavior invalidates the kernel documentation and the tcp man page.
It's not possible to set a connect timeout > 25 sec in applications while
using the default values in /proc.
From my point of view, this is a regression.
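The two default timeouts mentioned above can be reproduced with a small simulation. This is a sketch under stated assumptions, not kernel code: the SYN retransmission timer is assumed to start at 3 s and double on every expiry (capped at 120 s), the time estimate is assumed to be based on a 200 ms minimum RTO, and the time-based rule is assumed never to give up on the very first expiry (which is what the reported 2 -> 2 result suggests). The function name `simulate` is hypothetical.

```python
SYN_TIMEOUT_MS = 3000  # assumed initial SYN retransmission timeout
RTO_MIN_MS = 200       # assumed minimum RTO used by the time estimate
RTO_MAX_MS = 120000    # assumed upper bound for the backoff


def simulate(retries, time_based):
    """Return (syns_sent, ms_until_failure) for a SYN that is never answered."""
    # Time-based estimate; this simplified form assumes a small setting (<= 9).
    limit_ms = ((2 << retries) - 1) * RTO_MIN_MS
    now, rto, retransmits, syns = 0, SYN_TIMEOUT_MS, 0, 1
    while True:
        now += rto  # the retransmission timer expires
        if time_based:
            # Assumed: the first expiry always retransmits.
            give_up = retransmits > 0 and now >= limit_ms
        else:
            # Count-based rule: compare the retransmit counter directly.
            give_up = retransmits >= retries
        if give_up:
            return syns, now
        retransmits += 1
        syns += 1
        rto = min(rto * 2, RTO_MAX_MS)  # exponential backoff


print(simulate(5, time_based=False))
print(simulate(5, time_based=True))
```

With the default setting of 5, the count-based rule in this model sends 6 SYNs and fails after 189 s, while the time-based rule sends 3 SYNs and fails after 21 s, in line with the tcpdump and wget output in the report.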
On Saturday, September 25, 2010 05:05:57 am David Miller wrote:
> tcp_syn_retries is not an exact calculation.
>
> It is input into a calculation which estimates how long that many
> retransmits (with suitable backoff applied) will take, and that time
> estimate in turn determines the time limit for when we'll kill the
> connection attempt.
>
> Feel free to update the documentation in
> Documentation/networking/ip-sysctl.txt to more closely match the
> behavior.
>
> The logic is in net/ipv4/tcp_timer.c:retransmits_timed_out().
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-27 8:07 ` Yuri Chislov
@ 2010-09-27 8:10 ` David Miller
2010-09-27 8:35 ` Damian Lukowski
2010-09-27 20:00 ` Damian Lukowski
0 siblings, 2 replies; 10+ messages in thread
From: David Miller @ 2010-09-27 8:10 UTC
To: yuri; +Cc: akpm, netdev, bugzilla-daemon, bugme-daemon, damian
From: Yuri Chislov <yuri@itinteg.net>
Date: Mon, 27 Sep 2010 10:07:06 +0200
> It looks like the behavior changed in 2.6.32. Kernels 2.6.32 and up use a
> calculation instead of a direct retry count, which makes it harder to
> achieve the necessary system behavior.
>
> The default behavior of the system changed completely
> (the old default connect timeout was ~180 seconds, while the new one is ~21
> seconds).
>
> The new behavior invalidates the kernel documentation and the tcp man page.
>
> It's not possible to set a connect timeout > 25 sec in applications while
> using the default values in /proc.
>
> From my point of view, this is a regression.
Agreed, Damian you have to fix this.
Otherwise I'm reverting all of your Revert Backoff commits.
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-27 8:10 ` David Miller
@ 2010-09-27 8:35 ` Damian Lukowski
2010-09-27 20:00 ` Damian Lukowski
1 sibling, 0 replies; 10+ messages in thread
From: Damian Lukowski @ 2010-09-27 8:35 UTC
To: David Miller; +Cc: yuri, akpm, netdev, bugzilla-daemon, bugme-daemon
Hi David,
give me some time, please. I'll take a closer look in the evening.
Regards
Damian
Am Montag, den 27.09.2010, 01:10 -0700 schrieb David Miller:
> From: Yuri Chislov <yuri@itinteg.net>
> Date: Mon, 27 Sep 2010 10:07:06 +0200
>
> > It looks like the behavior changed in 2.6.32. Kernels 2.6.32 and up use a
> > calculation instead of a direct retry count, which makes it harder to
> > achieve the necessary system behavior.
> >
> > The default behavior of the system changed completely
> > (the old default connect timeout was ~180 seconds, while the new one is ~21
> > seconds).
> >
> > The new behavior invalidates the kernel documentation and the tcp man page.
> >
> > It's not possible to set a connect timeout > 25 sec in applications while
> > using the default values in /proc.
> >
> > From my point of view, this is a regression.
>
> Agreed, Damian you have to fix this.
>
> Otherwise I'm reverting all of your Revert Backoff commits.
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-27 8:10 ` David Miller
2010-09-27 8:35 ` Damian Lukowski
@ 2010-09-27 20:00 ` Damian Lukowski
2010-09-28 4:52 ` David Miller
1 sibling, 1 reply; 10+ messages in thread
From: Damian Lukowski @ 2010-09-27 20:00 UTC
To: David Miller; +Cc: yuri, akpm, netdev, bugzilla-daemon, bugme-daemon
Ok, I see.
When I read your mail this morning, I was afraid that this was another
bug in the calculation routine. However, the routine seems ok to me.
The problem is the high discrepancy between the RTO_MIN-based
calculation and TCP_TIMEOUT_INIT-based actual backoff in the SYN-case.
Yuri's example did not reveal a conceptually new bug, as the same
behaviour can be observed on pre 2.6.32 kernels at higher values.
I tested wget on a 2.6.26 kernel with timeout values of 300 and above.
All runs timed out after 189 seconds, with the sysctl-value as their
hard limit.
My suggestion for solving this issue:
Introducing a third boolean parameter for retransmits_timed_out()
indicating whether the socket is in SYN state or not. In the SYN case,
TCP_TIMEOUT_INIT will be used for the calculation instead of
TCP_RTO_MIN.
Is that ok?
Regards
Damian
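A rough model of the suggestion above, for illustration only. The function name `timed_out` and the `syn_set` parameter are hypothetical stand-ins for the proposed third argument, the constant values (200 ms, 3 s, 120 s) are assumptions, and an actual patch may well look different.

```python
# Assumed constant values, in milliseconds.
TCP_RTO_MIN_MS = 200
TCP_TIMEOUT_INIT_MS = 3000
TCP_RTO_MAX_MS = 120000


def timed_out(elapsed_ms, boundary, syn_set):
    """Time-based check that picks its base RTO from the socket state."""
    # In the SYN case, base the estimate on the initial timeout.
    rto_base = TCP_TIMEOUT_INIT_MS if syn_set else TCP_RTO_MIN_MS
    # Integer log2 of (max / base): doublings before the RTO saturates.
    thresh = (TCP_RTO_MAX_MS // rto_base).bit_length() - 1
    if boundary <= thresh:
        limit = ((2 << boundary) - 1) * rto_base
    else:
        limit = (((2 << thresh) - 1) * rto_base
                 + (boundary - thresh) * TCP_RTO_MAX_MS)
    return elapsed_ms >= limit


# 21 s into a connection attempt with a boundary of 5:
print(timed_out(21000, 5, syn_set=False))
print(timed_out(21000, 5, syn_set=True))
```

In this model a boundary of 5 with the SYN flag set gives a limit of 63 * 3 s = 189 s, which matches the 189 seconds measured on the 2.6.26 kernel.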
Am Montag, den 27.09.2010, 01:10 -0700 schrieb David Miller:
> From: Yuri Chislov <yuri@itinteg.net>
> Date: Mon, 27 Sep 2010 10:07:06 +0200
>
> > It looks like the behavior changed in 2.6.32. Kernels 2.6.32 and up use a
> > calculation instead of a direct retry count, which makes it harder to
> > achieve the necessary system behavior.
> >
> > The default behavior of the system changed completely
> > (the old default connect timeout was ~180 seconds, while the new one is ~21
> > seconds).
> >
> > The new behavior invalidates the kernel documentation and the tcp man page.
> >
> > It's not possible to set a connect timeout > 25 sec in applications while
> > using the default values in /proc.
> >
> > From my point of view, this is a regression.
>
> Agreed, Damian you have to fix this.
>
> Otherwise I'm reverting all of your Revert Backoff commits.
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-27 20:00 ` Damian Lukowski
@ 2010-09-28 4:52 ` David Miller
2010-09-28 7:40 ` Yuri Chislov
0 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2010-09-28 4:52 UTC
To: damian; +Cc: yuri, akpm, netdev, bugzilla-daemon, bugme-daemon
From: Damian Lukowski <damian@tvk.rwth-aachen.de>
Date: Mon, 27 Sep 2010 22:00:08 +0200
> My suggestion for solving this issue:
> Introducing a third boolean parameter for retransmits_timed_out()
> indicating whether the socket is in SYN state or not. In the SYN case,
> TCP_TIMEOUT_INIT will be used for the calculation instead of
> TCP_RTO_MIN.
>
> Is that ok?
Sounds fine to me, please prepare a patch.
Thanks!
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-28 4:52 ` David Miller
@ 2010-09-28 7:40 ` Yuri Chislov
2010-09-28 9:47 ` Ilpo Järvinen
0 siblings, 1 reply; 10+ messages in thread
From: Yuri Chislov @ 2010-09-28 7:40 UTC
To: David Miller; +Cc: damian, akpm, netdev, bugzilla-daemon, bugme-daemon
What is the advantage of replacing the comparison with a calculation?
Please clarify, if possible.
Thanks.
Yuri.
--- linux-2.6.31.14/net/ipv4/tcp_timer.c 2010-07-05 17:11:43.000000000 +0000
+++ linux-2.6.32.15/net/ipv4/tcp_timer.c 2010-06-01 16:56:03.000000000 +0000
@@ -137,13 +137,14 @@
{
struct inet_connection_sock *icsk = inet_csk(sk);
int retry_until;
+ bool do_reset;
if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
if (icsk->icsk_retransmits)
dst_negative_advice(&sk->sk_dst_cache);
retry_until = icsk->icsk_syn_retries ? :
sysctl_tcp_syn_retries;
} else {
- if (icsk->icsk_retransmits >= sysctl_tcp_retries1) {
+ if (retransmits_timed_out(sk, sysctl_tcp_retries1)) {
/* Black hole detection */
tcp_mtu_probing(icsk, sk);
@@ -155,13 +156,15 @@
const int alive = (icsk->icsk_rto < TCP_RTO_MAX);
retry_until = tcp_orphan_retries(sk, alive);
+ do_reset = alive ||
+ !retransmits_timed_out(sk, retry_until);
- if (tcp_out_of_resources(sk, alive || icsk->icsk_retransmits < retry_until))
+ if (tcp_out_of_resources(sk, do_reset))
return 1;
}
}
- if (icsk->icsk_retransmits >= retry_until) {
+ if (retransmits_timed_out(sk, retry_until)) {
/* Has it gone just too far? */
tcp_write_err(sk);
return 1;
@@ -279,7 +282,7 @@
* The TCP retransmit timer.
*/
-static void tcp_retransmit_timer(struct sock *sk)
+void tcp_retransmit_timer(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
struct inet_connection_sock *icsk = inet_csk(sk);
@@ -385,7 +388,7 @@
out_reset_timer:
icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, icsk->icsk_rto,
TCP_RTO_MAX);
- if (icsk->icsk_retransmits > sysctl_tcp_retries1)
+ if (retransmits_timed_out(sk, sysctl_tcp_retries1 + 1))
__sk_dst_reset(sk);
out:;
@@ -499,8 +502,7 @@
elapsed = tcp_time_stamp - tp->rcv_tstamp;
if (elapsed >= keepalive_time_when(tp)) {
- if ((!tp->keepalive_probes && icsk->icsk_probes_out >= sysctl_tcp_keepalive_probes) ||
- (tp->keepalive_probes && icsk->icsk_probes_out >= tp->keepalive_probes)) {
+ if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
tcp_send_active_reset(sk, GFP_ATOMIC);
tcp_write_err(sk);
goto out;
On Tuesday, September 28, 2010 06:52:41 am David Miller wrote:
> From: Damian Lukowski <damian@tvk.rwth-aachen.de>
> Date: Mon, 27 Sep 2010 22:00:08 +0200
>
> > My suggestion for solving this issue:
> > Introducing a third boolean parameter for retransmits_timed_out()
> > indicating whether the socket is in SYN state or not. In the SYN case,
> > TCP_TIMEOUT_INIT will be used for the calculation instead of
> > TCP_RTO_MIN.
> >
> > Is that ok?
>
> Sounds fine to me, please prepare a patch.
>
> Thanks!
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-28 7:40 ` Yuri Chislov
@ 2010-09-28 9:47 ` Ilpo Järvinen
2010-09-28 12:08 ` Yuri Chislov
0 siblings, 1 reply; 10+ messages in thread
From: Ilpo Järvinen @ 2010-09-28 9:47 UTC
To: Yuri Chislov
Cc: David Miller, damian, Andrew Morton, Netdev, bugzilla-daemon,
bugme-daemon
On Tue, 28 Sep 2010, Yuri Chislov wrote:
> What is the advantage of replacing the comparison with a calculation?
> Please clarify, if possible.
> Thanks.
> Yuri.
If you did take this from kernel history, reading the particular log
message might have helped?!?
> --- linux-2.6.31.14/net/ipv4/tcp_timer.c 2010-07-05 17:11:43.000000000 +0000
> +++ linux-2.6.32.15/net/ipv4/tcp_timer.c 2010-06-01 16:56:03.000000000 +0000
> @@ -137,13 +137,14 @@
> {
> struct inet_connection_sock *icsk = inet_csk(sk);
> int retry_until;
> + bool do_reset;
>
> if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
> if (icsk->icsk_retransmits)
> dst_negative_advice(&sk->sk_dst_cache);
> retry_until = icsk->icsk_syn_retries ? :
> sysctl_tcp_syn_retries;
> } else {
> - if (icsk->icsk_retransmits >= sysctl_tcp_retries1) {
> + if (retransmits_timed_out(sk, sysctl_tcp_retries1)) {
> /* Black hole detection */
> tcp_mtu_probing(icsk, sk);
>
> @@ -155,13 +156,15 @@
> const int alive = (icsk->icsk_rto < TCP_RTO_MAX);
>
> retry_until = tcp_orphan_retries(sk, alive);
> + do_reset = alive ||
> + !retransmits_timed_out(sk, retry_until);
>
> - if (tcp_out_of_resources(sk, alive || icsk->icsk_retransmits < retry_until))
> + if (tcp_out_of_resources(sk, do_reset))
> return 1;
> }
> }
>
> - if (icsk->icsk_retransmits >= retry_until) {
> + if (retransmits_timed_out(sk, retry_until)) {
> /* Has it gone just too far? */
> tcp_write_err(sk);
> return 1;
> @@ -279,7 +282,7 @@
> * The TCP retransmit timer.
> */
>
> -static void tcp_retransmit_timer(struct sock *sk)
> +void tcp_retransmit_timer(struct sock *sk)
> {
> struct tcp_sock *tp = tcp_sk(sk);
> struct inet_connection_sock *icsk = inet_csk(sk);
> @@ -385,7 +388,7 @@
> out_reset_timer:
> icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
> inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, icsk->icsk_rto,
> TCP_RTO_MAX);
> - if (icsk->icsk_retransmits > sysctl_tcp_retries1)
> + if (retransmits_timed_out(sk, sysctl_tcp_retries1 + 1))
> __sk_dst_reset(sk);
>
> out:;
> @@ -499,8 +502,7 @@
> elapsed = tcp_time_stamp - tp->rcv_tstamp;
>
> if (elapsed >= keepalive_time_when(tp)) {
> - if ((!tp->keepalive_probes && icsk->icsk_probes_out >= sysctl_tcp_keepalive_probes) ||
> - (tp->keepalive_probes && icsk->icsk_probes_out >= tp->keepalive_probes)) {
> + if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
> tcp_send_active_reset(sk, GFP_ATOMIC);
> tcp_write_err(sk);
--
i.
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
2010-09-28 9:47 ` Ilpo Järvinen
@ 2010-09-28 12:08 ` Yuri Chislov
0 siblings, 0 replies; 10+ messages in thread
From: Yuri Chislov @ 2010-09-28 12:08 UTC
To: Ilpo Järvinen
Cc: David Miller, damian, Andrew Morton, Netdev, bugzilla-daemon,
bugme-daemon
Thank you. The log is really helpful.
On Tuesday, September 28, 2010 11:47:23 am Ilpo Järvinen wrote:
> On Tue, 28 Sep 2010, Yuri Chislov wrote:
> > What is the advantage of replacing the comparison with a calculation?
> >
> > Please clarify, if possible.
> > Thanks.
> > Yuri.
>
> If you did take this from kernel history, reading the particular log
> message might have helped?!?