* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: Andrew Morton @ 2010-09-22 9:02 UTC
To: netdev; +Cc: bugzilla-daemon, bugme-daemon, yuri

(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Wed, 22 Sep 2010 08:50:12 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=18952
>
>            Summary: The mount of SYN retries is not equal to
>                     /proc/sys/net/ipv4/tcp_syn_retries
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.32.12, 2.6.32.15, 2.6.35.4
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV4
>         AssignedTo: shemminger@linux-foundation.org
>         ReportedBy: yuri@itinteg.net
>         Regression: No
>
>
> When setting a value in /proc/sys/net/ipv4/tcp_syn_retries, the actual number
> of SYN retries is the number set in /proc/sys/net/ipv4/tcp_syn_retries minus 2.
>
> For example:
> When setting /proc/sys/net/ipv4/tcp_syn_retries to 5, the actual number of SYN
> retries is 3.
> When setting /proc/sys/net/ipv4/tcp_syn_retries to 7, the actual number of SYN
> retries is 5.
> However, when setting /proc/sys/net/ipv4/tcp_syn_retries to 2, the actual
> number of SYN retries is 2.
>
> Note: In kernel 2.6.31.9 the actual number of SYNs = tcp_syn_retries + 1
>
>
> Steps to reproduce:
> sudo iptables -I INPUT 1 -i lo -p tcp --dport 88 -j DROP
> sudo tcpdump -n -i lo -v tcp port 88
>
> from another console run:
> time wget -t 1 -O - --connect-timeout=300 http://0:88
>
> tcpdump output:
> sudo tcpdump -n -i lo -v tcp port 88
> 11:29:39.820058 IP (tos 0x0, ttl 64, id 14664, offset 0, flags [DF], proto TCP (6), length 60)
>     127.0.0.1.43730 > 127.0.0.1.kerberos: Flags [S], cksum 0xfe30 (incorrect -> 0xecf4), seq 1012617667, win 32792, options [mss 16396,sackOK,TS val 12871819 ecr 0,nop,wscale 7], length 0
> 11:29:42.824091 IP (tos 0x0, ttl 64, id 14665, offset 0, flags [DF], proto TCP (6), length 60)
>     127.0.0.1.43730 > 127.0.0.1.kerberos: Flags [S], cksum 0xfe30 (incorrect -> 0xe137), seq 1012617667, win 32792, options [mss 16396,sackOK,TS val 12874824 ecr 0,nop,wscale 7], length 0
> 11:29:48.832153 IP (tos 0x0, ttl 64, id 14666, offset 0, flags [DF], proto TCP (6), length 60)
>     127.0.0.1.43730 > 127.0.0.1.kerberos: Flags [S], cksum 0xfe30 (incorrect -> 0xc9bf), seq 1012617667, win 32792, options [mss 16396,sackOK,TS val 12880832 ecr 0,nop,wscale 7], length 0
>
> wget output:
> time wget -t 1 -O - --connect-timeout=300 http://0:88
> Resolving 0... 0.0.0.0
> Connecting to 0|0.0.0.0|:88... failed: Connection timed out.
> Giving up.
>
>
> real    0m21.050s
> user    0m0.003s
> sys     0m0.004s
>
> The same happens with a remote host.

* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: David Miller @ 2010-09-25 3:05 UTC
To: akpm; +Cc: netdev, bugzilla-daemon, bugme-daemon, yuri

tcp_syn_retries is not an exact calculation.

It is input into a calculation which estimates how long that many
retransmits (with suitable backoff applied) will take, and that time
estimate in turn determines the time limit for when we'll kill the
connection attempt.

Feel free to update the documentation in
Documentation/networking/ip-sysctl.txt to more closely match the
behavior.

The logic is in net/ipv4/tcp_timer.c:retransmits_timed_out().

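The time-based check described above can be modelled in a few lines of userspace C. This is only a rough sketch, not the kernel routine: the names retrans_budget_ms() and simulate_syn() are hypothetical, the constant values (200 ms minimum RTO, 120 s maximum RTO, 3 s initial SYN timeout) are assumptions about kernels of that era, and the model further assumes that the first timer expiry always retransmits and that elapsed time is counted from the first retransmission. Under those assumptions it appears to reproduce the SYN counts in the report.

```c
/* Assumed values in milliseconds; none of them are stated in the thread. */
#define TCP_RTO_MIN_MS      200u     /* assumed minimum RTO */
#define TCP_RTO_MAX_MS      120000u  /* assumed maximum RTO */
#define TCP_TIMEOUT_INIT_MS 3000u    /* assumed initial SYN timeout */

/* floor(log2(v)) for v > 0 */
static unsigned int ilog2_u(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical helper: estimated time that "boundary" retransmits take
 * when exponential backoff starts at rto_base_ms and is capped at
 * TCP_RTO_MAX_MS.
 */
static unsigned int retrans_budget_ms(unsigned int boundary,
				      unsigned int rto_base_ms)
{
	unsigned int thresh = ilog2_u(TCP_RTO_MAX_MS / rto_base_ms);

	if (boundary <= thresh)
		return ((2u << boundary) - 1) * rto_base_ms;
	return ((2u << thresh) - 1) * rto_base_ms +
	       (boundary - thresh) * TCP_RTO_MAX_MS;
}

struct syn_result {
	unsigned int syns_sent;     /* initial SYN plus retransmits */
	unsigned int gave_up_at_ms; /* time at which connect() fails */
};

/*
 * Hypothetical model of an unanswered connect(): the budget is derived
 * from TCP_RTO_MIN_MS, while the real SYN timer starts at
 * TCP_TIMEOUT_INIT_MS and doubles on every expiry.
 */
static struct syn_result simulate_syn(unsigned int syn_retries)
{
	unsigned int budget = retrans_budget_ms(syn_retries, TCP_RTO_MIN_MS);
	unsigned int now = 0, rto = TCP_TIMEOUT_INIT_MS;
	unsigned int first_retrans = 0, retransmits = 0;
	struct syn_result res = { 1, 0 }; /* initial SYN at t = 0 */

	for (;;) {
		now += rto; /* retransmit timer fires */
		if (retransmits > 0 && now - first_retrans >= budget) {
			res.gave_up_at_ms = now;
			return res;
		}
		if (retransmits == 0)
			first_retrans = now;
		retransmits++;
		res.syns_sent++;
		rto = rto * 2 > TCP_RTO_MAX_MS ? TCP_RTO_MAX_MS : rto * 2;
	}
}
```

With these assumptions, simulate_syn(5) sends 3 SYNs and gives up at 21000 ms, which lines up with the three packets in the tcpdump trace and the 21 s wget run. The mismatch comes from computing the budget with a 200 ms base while the SYN timer actually backs off from 3 s.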
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: Yuri Chislov @ 2010-09-27 8:07 UTC
To: David Miller; +Cc: akpm, netdev, bugzilla-daemon, bugme-daemon

It looks like the behavior changed in 2.6.32. Kernels 2.6.32 and up use a
calculation instead of a direct definition of the number of retries, which
makes it harder to achieve the necessary system behavior.

The default behavior of the system changed completely
(the old default connect timeout was ~180 seconds, while the new one is
~21 seconds).

The new behavior invalidates the kernel documentation and the tcp man page.

It's not possible to set a connect timeout > 25 sec in applications while
using the default values in /proc.

From my point of view this is a regression.

On Saturday, September 25, 2010 05:05:57 am David Miller wrote:
> tcp_syn_retries is not an exact calculation.
>
> It is input into a calculation which estimates how long that many
> retransmits (with suitable backoff applied) will take, and that time
> estimate in turn determines the time limit for when we'll kill the
> connection attempt.
>
> Feel free to update the documentation in
> Documentation/networking/ip-sysctl.txt to more closely match the
> behavior.
>
> The logic is in net/ipv4/tcp_timer.c:retransmits_timed_out().

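For comparison, the old default of roughly 180 seconds can be checked with a small model of the count-based rule, where the attempt is abandoned once the number of retransmits reaches the limit. This is a sketch under assumptions: the function name count_based_connect_timeout_ms() is hypothetical, and the 3 s initial SYN timeout and 120 s RTO cap are assumed values that the thread does not state.

```c
/* Assumed values in milliseconds; not stated in the thread. */
#define TCP_TIMEOUT_INIT_MS 3000u    /* assumed initial SYN timeout */
#define TCP_RTO_MAX_MS      120000u  /* assumed maximum RTO */

/*
 * Hypothetical model of the count-based rule: on each timer expiry,
 * give up if "retransmits >= syn_retries", otherwise retransmit the
 * SYN and double the timeout.
 */
static unsigned int count_based_connect_timeout_ms(unsigned int syn_retries)
{
	unsigned int now = 0, rto = TCP_TIMEOUT_INIT_MS, retransmits = 0;

	for (;;) {
		now += rto; /* retransmit timer fires */
		if (retransmits >= syn_retries)
			return now;
		retransmits++;
		rto = rto * 2 > TCP_RTO_MAX_MS ? TCP_RTO_MAX_MS : rto * 2;
	}
}
```

For syn_retries = 5 the timer fires at 3, 9, 21, 45, 93 and 189 seconds, so the model gives up after 189000 ms having sent 6 SYNs in total, i.e. tcp_syn_retries + 1.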
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: David Miller @ 2010-09-27 8:10 UTC
To: yuri; +Cc: akpm, netdev, bugzilla-daemon, bugme-daemon, damian

From: Yuri Chislov <yuri@itinteg.net>
Date: Mon, 27 Sep 2010 10:07:06 +0200

> It looks like the behavior changed in 2.6.32. Kernels 2.6.32 and up use a
> calculation instead of a direct definition of the number of retries, which
> makes it harder to achieve the necessary system behavior.
>
> The default behavior of the system changed completely
> (the old default connect timeout was ~180 seconds, while the new one is
> ~21 seconds).
>
> The new behavior invalidates the kernel documentation and the tcp man page.
>
> It's not possible to set a connect timeout > 25 sec in applications while
> using the default values in /proc.
>
> From my point of view this is a regression.

Agreed, Damian you have to fix this.

Otherwise I'm reverting all of your Revert Backoff commits.

* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: Damian Lukowski @ 2010-09-27 8:35 UTC
To: David Miller; +Cc: yuri, akpm, netdev, bugzilla-daemon, bugme-daemon

Hi David,

give me some time, please. I'll take a closer look in the evening.

Regards
 Damian

On Monday, 27.09.2010, at 01:10 -0700, David Miller wrote:
> From: Yuri Chislov <yuri@itinteg.net>
> Date: Mon, 27 Sep 2010 10:07:06 +0200
>
> > It looks like the behavior changed in 2.6.32. Kernels 2.6.32 and up use a
> > calculation instead of a direct definition of the number of retries, which
> > makes it harder to achieve the necessary system behavior.
> >
> > The default behavior of the system changed completely
> > (the old default connect timeout was ~180 seconds, while the new one is
> > ~21 seconds).
> >
> > The new behavior invalidates the kernel documentation and the tcp man page.
> >
> > It's not possible to set a connect timeout > 25 sec in applications while
> > using the default values in /proc.
> >
> > From my point of view this is a regression.
>
> Agreed, Damian you have to fix this.
>
> Otherwise I'm reverting all of your Revert Backoff commits.

* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: Damian Lukowski @ 2010-09-27 20:00 UTC
To: David Miller; +Cc: yuri, akpm, netdev, bugzilla-daemon, bugme-daemon

Ok, I see.
When I read your mail this morning, I was afraid that this was another
bug in the calculation routine. However, the routine seems ok to me.
The problem is the large discrepancy between the TCP_RTO_MIN-based
calculation and the TCP_TIMEOUT_INIT-based actual backoff in the SYN case.

Yuri's example did not reveal a conceptually new bug, as the same
behaviour can be observed on pre-2.6.32 kernels at higher values. I
tested wget on a 2.6.26 kernel with timeout values of 300 and above. All
runs timed out after 189 seconds, with the sysctl value as their
hard limit.

My suggestion for solving this issue:
Introduce a third boolean parameter for retransmits_timed_out()
indicating whether the socket is in SYN state or not. In the SYN case,
TCP_TIMEOUT_INIT will be used for the calculation instead of
TCP_RTO_MIN.

Is that ok?

Regards
 Damian

On Monday, 27.09.2010, at 01:10 -0700, David Miller wrote:
> From: Yuri Chislov <yuri@itinteg.net>
> Date: Mon, 27 Sep 2010 10:07:06 +0200
>
> > It looks like the behavior changed in 2.6.32. Kernels 2.6.32 and up use a
> > calculation instead of a direct definition of the number of retries, which
> > makes it harder to achieve the necessary system behavior.
> >
> > The default behavior of the system changed completely
> > (the old default connect timeout was ~180 seconds, while the new one is
> > ~21 seconds).
> >
> > The new behavior invalidates the kernel documentation and the tcp man page.
> >
> > It's not possible to set a connect timeout > 25 sec in applications while
> > using the default values in /proc.
> >
> > From my point of view this is a regression.
>
> Agreed, Damian you have to fix this.
>
> Otherwise I'm reverting all of your Revert Backoff commits.

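The suggestion above, picking the backoff base according to whether the socket is in SYN state, could look roughly like the following userspace sketch. It is not the eventual patch: the function name retrans_budget_ms() and the parameter name syn_set are hypothetical, and the constant values (200 ms, 3 s, 120 s) are assumptions rather than figures taken from the thread.

```c
#include <stdbool.h>

/* Assumed values in milliseconds; not stated in the thread. */
#define TCP_RTO_MIN_MS      200u     /* assumed minimum RTO */
#define TCP_RTO_MAX_MS      120000u  /* assumed maximum RTO */
#define TCP_TIMEOUT_INIT_MS 3000u    /* assumed initial SYN timeout */

/* floor(log2(v)) for v > 0 */
static unsigned int ilog2_u(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical sketch of the proposal: the time budget for "boundary"
 * retransmits is computed from TCP_TIMEOUT_INIT_MS when the socket is
 * in SYN state (syn_set) and from TCP_RTO_MIN_MS otherwise.
 */
static unsigned int retrans_budget_ms(unsigned int boundary, bool syn_set)
{
	unsigned int rto_base = syn_set ? TCP_TIMEOUT_INIT_MS : TCP_RTO_MIN_MS;
	unsigned int thresh = ilog2_u(TCP_RTO_MAX_MS / rto_base);

	if (boundary <= thresh)
		return ((2u << boundary) - 1) * rto_base;
	return ((2u << thresh) - 1) * rto_base +
	       (boundary - thresh) * TCP_RTO_MAX_MS;
}
```

Under these assumptions, a boundary of 5 yields a budget of 189000 ms in the SYN case, the same 189 seconds measured on the 2.6.26 kernel, compared with 12600 ms when the 200 ms base is used.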
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: David Miller @ 2010-09-28 4:52 UTC
To: damian; +Cc: yuri, akpm, netdev, bugzilla-daemon, bugme-daemon

From: Damian Lukowski <damian@tvk.rwth-aachen.de>
Date: Mon, 27 Sep 2010 22:00:08 +0200

> My suggestion for solving this issue:
> Introducing a third boolean parameter for retransmits_timed_out()
> indicating whether the socket is in SYN state or not. In the SYN case,
> TCP_TIMEOUT_INIT will be used for the calculation instead of
> TCP_RTO_MIN.
>
> Is that ok?

Sounds fine to me, please prepare a patch.

Thanks!

* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: Yuri Chislov @ 2010-09-28 7:40 UTC
To: David Miller; +Cc: damian, akpm, netdev, bugzilla-daemon, bugme-daemon

What is the advantage of replacing the comparison with a calculation?

Please clarify, if possible.
Thanks.
Yuri.

--- linux-2.6.31.14/net/ipv4/tcp_timer.c	2010-07-05 17:11:43.000000000 +0000
+++ linux-2.6.32.15/net/ipv4/tcp_timer.c	2010-06-01 16:56:03.000000000 +0000
@@ -137,13 +137,14 @@
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	int retry_until;
+	bool do_reset;
 
 	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
 		if (icsk->icsk_retransmits)
 			dst_negative_advice(&sk->sk_dst_cache);
 		retry_until = icsk->icsk_syn_retries ? : sysctl_tcp_syn_retries;
 	} else {
-		if (icsk->icsk_retransmits >= sysctl_tcp_retries1) {
+		if (retransmits_timed_out(sk, sysctl_tcp_retries1)) {
 			/* Black hole detection */
 			tcp_mtu_probing(icsk, sk);
 
@@ -155,13 +156,15 @@
 			const int alive = (icsk->icsk_rto < TCP_RTO_MAX);
 
 			retry_until = tcp_orphan_retries(sk, alive);
+			do_reset = alive ||
+				   !retransmits_timed_out(sk, retry_until);
 
-			if (tcp_out_of_resources(sk, alive || icsk->icsk_retransmits < retry_until))
+			if (tcp_out_of_resources(sk, do_reset))
 				return 1;
 		}
 	}
 
-	if (icsk->icsk_retransmits >= retry_until) {
+	if (retransmits_timed_out(sk, retry_until)) {
 		/* Has it gone just too far? */
 		tcp_write_err(sk);
 		return 1;
@@ -279,7 +282,7 @@
  *	The TCP retransmit timer.
  */
 
-static void tcp_retransmit_timer(struct sock *sk)
+void tcp_retransmit_timer(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_connection_sock *icsk = inet_csk(sk);
@@ -385,7 +388,7 @@
 out_reset_timer:
 	icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
 	inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, icsk->icsk_rto, TCP_RTO_MAX);
-	if (icsk->icsk_retransmits > sysctl_tcp_retries1)
+	if (retransmits_timed_out(sk, sysctl_tcp_retries1 + 1))
 		__sk_dst_reset(sk);
 
 out:;
@@ -499,8 +502,7 @@
 	elapsed = tcp_time_stamp - tp->rcv_tstamp;
 
 	if (elapsed >= keepalive_time_when(tp)) {
-		if ((!tp->keepalive_probes && icsk->icsk_probes_out >= sysctl_tcp_keepalive_probes) ||
-		     (tp->keepalive_probes && icsk->icsk_probes_out >= tp->keepalive_probes)) {
+		if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
 			tcp_send_active_reset(sk, GFP_ATOMIC);
 			tcp_write_err(sk);
 			goto out;

On Tuesday, September 28, 2010 06:52:41 am David Miller wrote:
> From: Damian Lukowski <damian@tvk.rwth-aachen.de>
> Date: Mon, 27 Sep 2010 22:00:08 +0200
>
> > My suggestion for solving this issue:
> > Introducing a third boolean parameter for retransmits_timed_out()
> > indicating whether the socket is in SYN state or not. In the SYN case,
> > TCP_TIMEOUT_INIT will be used for the calculation instead of
> > TCP_RTO_MIN.
> >
> > Is that ok?
>
> Sounds fine to me, please prepare a patch.
>
> Thanks!

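The last hunk of the diff above folds a two-branch keepalive condition into a single comparison against keepalive_probes(tp). A minimal userspace sketch of why the two forms agree is shown below. The struct tcp_sock_model type and the old_check()/new_check() wrappers are hypothetical stand-ins, the helper body is an assumption about what keepalive_probes() does (per-socket value if set, otherwise the sysctl), and the value 9 for the sysctl is likewise assumed.

```c
#include <stdbool.h>

/* Assumed system-wide default; the thread does not state it. */
static unsigned int sysctl_tcp_keepalive_probes = 9;

/* Hypothetical stand-in for the relevant field of struct tcp_sock. */
struct tcp_sock_model {
	unsigned int keepalive_probes; /* 0 means "use the sysctl" */
};

/* Assumed behaviour of the helper used by the new code. */
static unsigned int keepalive_probes(const struct tcp_sock_model *tp)
{
	return tp->keepalive_probes ? tp->keepalive_probes
				    : sysctl_tcp_keepalive_probes;
}

/* The removed two-branch condition from the diff. */
static bool old_check(const struct tcp_sock_model *tp, unsigned int probes_out)
{
	return (!tp->keepalive_probes &&
		probes_out >= sysctl_tcp_keepalive_probes) ||
	       (tp->keepalive_probes &&
		probes_out >= tp->keepalive_probes);
}

/* The replacement single comparison from the diff. */
static bool new_check(const struct tcp_sock_model *tp, unsigned int probes_out)
{
	return probes_out >= keepalive_probes(tp);
}
```

Both checks select the same limit in every case, so this hunk reads as a pure cleanup, unlike the retransmit hunks, which swap a counter comparison for a time-based one.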
* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: Ilpo Järvinen @ 2010-09-28 9:47 UTC
To: Yuri Chislov
Cc: David Miller, damian, Andrew Morton, Netdev, bugzilla-daemon, bugme-daemon

On Tue, 28 Sep 2010, Yuri Chislov wrote:

> What is the advantage of replacing the comparison with a calculation?
>
> Please clarify, if possible.
> Thanks.
> Yuri.

If you took this from the kernel history, reading the log message of that
particular commit might have helped.

> --- linux-2.6.31.14/net/ipv4/tcp_timer.c	2010-07-05 17:11:43.000000000 +0000
> +++ linux-2.6.32.15/net/ipv4/tcp_timer.c	2010-06-01 16:56:03.000000000 +0000
> @@ -137,13 +137,14 @@
>  {
>  	struct inet_connection_sock *icsk = inet_csk(sk);
>  	int retry_until;
> +	bool do_reset;
>  
>  	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
>  		if (icsk->icsk_retransmits)
>  			dst_negative_advice(&sk->sk_dst_cache);
>  		retry_until = icsk->icsk_syn_retries ? : sysctl_tcp_syn_retries;
>  	} else {
> -		if (icsk->icsk_retransmits >= sysctl_tcp_retries1) {
> +		if (retransmits_timed_out(sk, sysctl_tcp_retries1)) {
>  			/* Black hole detection */
>  			tcp_mtu_probing(icsk, sk);
>  
> @@ -155,13 +156,15 @@
>  			const int alive = (icsk->icsk_rto < TCP_RTO_MAX);
>  
>  			retry_until = tcp_orphan_retries(sk, alive);
> +			do_reset = alive ||
> +				   !retransmits_timed_out(sk, retry_until);
>  
> -			if (tcp_out_of_resources(sk, alive || icsk->icsk_retransmits < retry_until))
> +			if (tcp_out_of_resources(sk, do_reset))
>  				return 1;
>  		}
>  	}
>  
> -	if (icsk->icsk_retransmits >= retry_until) {
> +	if (retransmits_timed_out(sk, retry_until)) {
>  		/* Has it gone just too far? */
>  		tcp_write_err(sk);
>  		return 1;
> @@ -279,7 +282,7 @@
>   *	The TCP retransmit timer.
>   */
>  
> -static void tcp_retransmit_timer(struct sock *sk)
> +void tcp_retransmit_timer(struct sock *sk)
>  {
>  	struct tcp_sock *tp = tcp_sk(sk);
>  	struct inet_connection_sock *icsk = inet_csk(sk);
> @@ -385,7 +388,7 @@
>  out_reset_timer:
>  	icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
>  	inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, icsk->icsk_rto, TCP_RTO_MAX);
> -	if (icsk->icsk_retransmits > sysctl_tcp_retries1)
> +	if (retransmits_timed_out(sk, sysctl_tcp_retries1 + 1))
>  		__sk_dst_reset(sk);
>  
>  out:;
> @@ -499,8 +502,7 @@
>  	elapsed = tcp_time_stamp - tp->rcv_tstamp;
>  
>  	if (elapsed >= keepalive_time_when(tp)) {
> -		if ((!tp->keepalive_probes && icsk->icsk_probes_out >= sysctl_tcp_keepalive_probes) ||
> -		     (tp->keepalive_probes && icsk->icsk_probes_out >= tp->keepalive_probes)) {
> +		if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
>  			tcp_send_active_reset(sk, GFP_ATOMIC);
>  			tcp_write_err(sk);

-- 
 i.

* Re: [Bugme-new] [Bug 18952] New: The mount of SYN retries is not equal to /proc/sys/net/ipv4/tcp_syn_retries
From: Yuri Chislov @ 2010-09-28 12:08 UTC
To: Ilpo Järvinen
Cc: David Miller, damian, Andrew Morton, Netdev, bugzilla-daemon, bugme-daemon

Thank you.
The log is really helpful.

On Tuesday, September 28, 2010 11:47:23 am Ilpo Järvinen wrote:
> On Tue, 28 Sep 2010, Yuri Chislov wrote:
> > What is the advantage of replacing the comparison with a calculation?
> >
> > Please clarify, if possible.
> > Thanks.
> > Yuri.
>
> If you took this from the kernel history, reading the log message of that
> particular commit might have helped.