* TCP's initial cwnd setting correct?...
@ 2007-08-06 12:37 Ilpo Järvinen
2007-08-06 18:31 ` Ilpo Järvinen
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Ilpo Järvinen @ 2007-08-06 12:37 UTC (permalink / raw)
To: Netdev; +Cc: David Miller
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3410 bytes --]
I'm I missing something or why it seems to me that whenever rtt stuff is
reset, also tcp_init_cwnd is not called to do RFC3390-like initial
window setting (remains then in 2 which is set by tcp_v4_init_sock)...
Is this done on purpose or what?
...Another thing that makes me wonder, is the tp->mss_cache > 1460 check
as based on rfc3390 (and also it's precursor rfc2414) with up to 2190
bytes MSS TCP can use 3 as initial cwnd...
...Compile tested patch below in case these are valid concerns... ...Goto
logic could be cleaner (somebody has any suggestion for better way to
structure it?)
--
i.
[PATCH] [TCP]: Use IW also when RTT vars get reset & update to match RFC
Previously when RTT got reset, initial window was not being set.
Initial window defined RFC3390 (and it's precursor 2414) allows
using initial window of three as long as MSS < 2190 bytes.
Also update RFC2414 references as it's obsoleted by RFC3390.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
---
net/ipv4/tcp_input.c | 32 +++++++++++++++++---------------
net/ipv4/tcp_output.c | 2 +-
2 files changed, 18 insertions(+), 16 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index eb96864..89794ec 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -805,13 +805,13 @@ void tcp_update_metrics(struct sock *sk)
}
}
-/* Numbers are taken from RFC2414. */
+/* Numbers are taken from RFC3390. */
__u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst)
{
__u32 cwnd = (dst ? dst_metric(dst, RTAX_INITCWND) : 0);
if (!cwnd) {
- if (tp->mss_cache > 1460)
+ if (tp->mss_cache >= 2190)
cwnd = 2;
else
cwnd = (tp->mss_cache > 1095) ? 3 : 4;
@@ -897,22 +897,24 @@ static void tcp_init_metrics(struct sock *sk)
}
tcp_set_rto(sk);
tcp_bound_rto(sk);
- if (inet_csk(sk)->icsk_rto < TCP_TIMEOUT_INIT && !tp->rx_opt.saw_tstamp)
- goto reset;
- tp->snd_cwnd = tcp_init_cwnd(tp, dst);
- tp->snd_cwnd_stamp = tcp_time_stamp;
- return;
+
+ if (inet_csk(sk)->icsk_rto < TCP_TIMEOUT_INIT && !tp->rx_opt.saw_tstamp) {
reset:
- /* Play conservative. If timestamps are not
- * supported, TCP will fail to recalculate correct
- * rtt, if initial rto is too small. FORGET ALL AND RESET!
- */
- if (!tp->rx_opt.saw_tstamp && tp->srtt) {
- tp->srtt = 0;
- tp->mdev = tp->mdev_max = tp->rttvar = TCP_TIMEOUT_INIT;
- inet_csk(sk)->icsk_rto = TCP_TIMEOUT_INIT;
+ /* Play conservative. If timestamps are not
+ * supported, TCP will fail to recalculate correct
+ * rtt, if initial rto is too small. FORGET ALL AND RESET!
+ */
+ if (!tp->rx_opt.saw_tstamp && tp->srtt) {
+ tp->srtt = 0;
+ tp->mdev = tp->mdev_max = tp->rttvar = TCP_TIMEOUT_INIT;
+ inet_csk(sk)->icsk_rto = TCP_TIMEOUT_INIT;
+ }
}
+
+ tp->snd_cwnd = tcp_init_cwnd(tp, dst);
+ tp->snd_cwnd_stamp = tcp_time_stamp;
+ return;
}
static void tcp_update_reordering(struct sock *sk, const int metric,
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 3abe22e..4f9be23 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -209,7 +209,7 @@ void tcp_select_initial_window(int __space, __u32 mss,
}
/* Set initial window to value enough for senders,
- * following RFC2414. Senders, not following this RFC,
+ * following RFC3390. Senders, not following this RFC,
* will be satisfied with 2.
*/
if (mss > (1<<*rcv_wscale)) {
--
1.5.0.6
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: TCP's initial cwnd setting correct?...
2007-08-06 12:37 TCP's initial cwnd setting correct? Ilpo Järvinen
@ 2007-08-06 18:31 ` Ilpo Järvinen
2007-08-08 15:26 ` John Heffner
2007-08-08 1:28 ` David Miller
2007-08-08 5:01 ` David Miller
2 siblings, 1 reply; 8+ messages in thread
From: Ilpo Järvinen @ 2007-08-06 18:31 UTC (permalink / raw)
To: Netdev; +Cc: David Miller
[-- Attachment #1: Type: TEXT/PLAIN, Size: 263 bytes --]
On Mon, 6 Aug 2007, Ilpo Järvinen wrote:
> ...Goto logic could be cleaner (somebody has any suggestion for better
> way to structure it?)
...I could probably move the setting of snd_cwnd earlier to avoid
this problem if this seems a valid fix at all.
--
i.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TCP's initial cwnd setting correct?...
2007-08-06 18:31 ` Ilpo Järvinen
@ 2007-08-08 15:26 ` John Heffner
0 siblings, 0 replies; 8+ messages in thread
From: John Heffner @ 2007-08-08 15:26 UTC (permalink / raw)
To: Ilpo Järvinen; +Cc: Netdev, David Miller
That sounds right to me.
-John
Ilpo Järvinen wrote:
> On Mon, 6 Aug 2007, Ilpo Järvinen wrote:
>
>> ...Goto logic could be cleaner (somebody has any suggestion for better
>> way to structure it?)
>
> ...I could probably move the setting of snd_cwnd earlier to avoid
> this problem if this seems a valid fix at all.
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TCP's initial cwnd setting correct?...
2007-08-06 12:37 TCP's initial cwnd setting correct? Ilpo Järvinen
2007-08-06 18:31 ` Ilpo Järvinen
@ 2007-08-08 1:28 ` David Miller
2007-08-08 15:20 ` John Heffner
2007-08-08 5:01 ` David Miller
2 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2007-08-08 1:28 UTC (permalink / raw)
To: ilpo.jarvinen; +Cc: netdev
From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Mon, 6 Aug 2007 15:37:15 +0300 (EEST)
> @@ -805,13 +805,13 @@ void tcp_update_metrics(struct sock *sk)
> }
> }
>
> -/* Numbers are taken from RFC2414. */
> +/* Numbers are taken from RFC3390. */
> __u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst)
> {
> __u32 cwnd = (dst ? dst_metric(dst, RTAX_INITCWND) : 0);
>
> if (!cwnd) {
> - if (tp->mss_cache > 1460)
> + if (tp->mss_cache >= 2190)
> cwnd = 2;
> else
> cwnd = (tp->mss_cache > 1095) ? 3 : 4;
I remember suggesting something similar about 5 or 6 years
ago and Alexey Kuznetsov at the time explained the numbers
which are there and why they should not be changed.
I forget the reasons though, and I'll try to do the research.
These numbers have been like this forever, FWIW.
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: TCP's initial cwnd setting correct?...
2007-08-08 1:28 ` David Miller
@ 2007-08-08 15:20 ` John Heffner
2007-08-09 4:40 ` David Miller
0 siblings, 1 reply; 8+ messages in thread
From: John Heffner @ 2007-08-08 15:20 UTC (permalink / raw)
To: David Miller; +Cc: ilpo.jarvinen, netdev
I believe the current calculation is correct. The RFC specifies a
window of no more than 4380 bytes unless 2*MSS > 4380. If you change
the code in this way, then MSS=1461 will give you an initial window of
3*MSS == 4383, violating the spec. Reading the pseudocode in the RFC
3390 is a bit misleading because they use a clamp at 4380 bytes rather
than use a multiplier in the relevant range.
-John
David Miller wrote:
> From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
> Date: Mon, 6 Aug 2007 15:37:15 +0300 (EEST)
>
>> @@ -805,13 +805,13 @@ void tcp_update_metrics(struct sock *sk)
>> }
>> }
>>
>> -/* Numbers are taken from RFC2414. */
>> +/* Numbers are taken from RFC3390. */
>> __u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst)
>> {
>> __u32 cwnd = (dst ? dst_metric(dst, RTAX_INITCWND) : 0);
>>
>> if (!cwnd) {
>> - if (tp->mss_cache > 1460)
>> + if (tp->mss_cache >= 2190)
>> cwnd = 2;
>> else
>> cwnd = (tp->mss_cache > 1095) ? 3 : 4;
>
> I remember suggesting something similar about 5 or 6 years
> ago and Alexey Kuznetsov at the time explained the numbers
> which are there and why they should not be changed.
>
> I forget the reasons though, and I'll try to do the research.
>
> These numbers have been like this forever, FWIW.
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: TCP's initial cwnd setting correct?...
2007-08-08 15:20 ` John Heffner
@ 2007-08-09 4:40 ` David Miller
0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2007-08-09 4:40 UTC (permalink / raw)
To: jheffner; +Cc: ilpo.jarvinen, netdev
From: John Heffner <jheffner@psc.edu>
Date: Wed, 08 Aug 2007 11:20:05 -0400
> I believe the current calculation is correct. The RFC specifies a
> window of no more than 4380 bytes unless 2*MSS > 4380. If you change
> the code in this way, then MSS=1461 will give you an initial window of
> 3*MSS == 4383, violating the spec. Reading the pseudocode in the RFC
> 3390 is a bit misleading because they use a clamp at 4380 bytes rather
> than use a multiplier in the relevant range.
Thanks for this excellent clarification John.
Because this has tripped up enough people, not once but on
multiple occaisions, I'm going to add some comments to
tcp_init_cwnd() to save the next person who is confused
by what seems to be an incorrect implementation of the RFC :-)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TCP's initial cwnd setting correct?...
2007-08-06 12:37 TCP's initial cwnd setting correct? Ilpo Järvinen
2007-08-06 18:31 ` Ilpo Järvinen
2007-08-08 1:28 ` David Miller
@ 2007-08-08 5:01 ` David Miller
2007-08-08 13:05 ` Ilpo Järvinen
2 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2007-08-08 5:01 UTC (permalink / raw)
To: ilpo.jarvinen; +Cc: netdev
From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Mon, 6 Aug 2007 15:37:15 +0300 (EEST)
> ...Another thing that makes me wonder, is the tp->mss_cache > 1460 check
> as based on rfc3390 (and also it's precursor rfc2414) with up to 2190
> bytes MSS TCP can use 3 as initial cwnd...
I did the research and my memory was at least partially right.
Below is an old bogus change of mine and the later revert with
Alexey's explanation.
This seems to be dealing with receive window calculation issues,
rather than snd_cwnd. But they might be related and you should
consider this very seriously.
commit 6b251858d377196b8cea20e65cae60f584a42735
Author: David S. Miller <davem@sunset.davemloft.net>
Date: Wed Sep 28 16:31:48 2005 -0700
[TCP]: Fix init_cwnd calculations in tcp_select_initial_window()
Match it up to what RFC2414 really specifies.
Noticed by Rick Jones.
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d6e3d26..caf2e2c 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -190,15 +190,16 @@ void tcp_select_initial_window(int __space, __u32 mss,
}
/* Set initial window to value enough for senders,
- * following RFC1414. Senders, not following this RFC,
+ * following RFC2414. Senders, not following this RFC,
* will be satisfied with 2.
*/
if (mss > (1<<*rcv_wscale)) {
- int init_cwnd = 4;
- if (mss > 1460*3)
+ int init_cwnd;
+
+ if (mss > 1460)
init_cwnd = 2;
- else if (mss > 1460)
- init_cwnd = 3;
+ else
+ init_cwnd = (mss > 1095) ? 3 : 4;
if (*rcv_wnd > init_cwnd*mss)
*rcv_wnd = init_cwnd*mss;
}
--------------------
commit 01ff367e62f0474e4d39aa5812cbe2a30d96e1e9
Author: David S. Miller <davem@sunset.davemloft.net>
Date: Thu Sep 29 17:07:20 2005 -0700
[TCP]: Revert 6b251858d377196b8cea20e65cae60f584a42735
But retain the comment fix.
Alexey Kuznetsov has explained the situation as follows:
--------------------
I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is
not continuous: f.e. for mss=1095 it needs initial window 1095*4, but
for mss=1096 it is 1096*3. We do not know exactly what mss sender used
for calculations. If we advertised 1096 (and calculate initial window
3*1096), the sender could limit it to some value < 1096 and then it
will need window his_mss*4 > 3*1096 to send initial burst.
See?
So, the honest function for inital rcv_wnd derived from
tcp_init_cwnd() is:
init_rcv_wnd(mss)=
min { init_cwnd(mss1)*mss1 for mss1 <= mss }
It is something sort of:
if (mss < 1096)
return mss*4;
if (mss < 1096*2)
return 1096*4;
return mss*2;
(I just scrablled a graph of piece of paper, it is difficult to see or
to explain without this)
I selected it differently giving more window than it is strictly
required. Initial receive window must be large enough to allow sender
following to the rfc (or just setting initial cwnd to 2) to send
initial burst. But besides that it is arbitrary, so I decided to give
slack space of one segment.
Actually, the logic was:
If mss is low/normal (<=ethernet), set window to receive more than
initial burst allowed by rfc under the worst conditions
i.e. mss*4. This gives slack space of 1 segment for ethernet frames.
For msses slighlty more than ethernet frame, take 3. Try to give slack
space of 1 frame again.
If mss is huge, force 2*mss. No slack space.
Value 1460*3 is really confusing. Minimal one is 1096*2, but besides
that it is an arbitrary value. It was meant to be ~4096. 1460*3 is
just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so
that I guess hands typed this themselves.
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index caf2e2c..c5b911f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -194,12 +194,11 @@ void tcp_select_initial_window(int __space, __u32 mss,
* will be satisfied with 2.
*/
if (mss > (1<<*rcv_wscale)) {
- int init_cwnd;
-
- if (mss > 1460)
+ int init_cwnd = 4;
+ if (mss > 1460*3)
init_cwnd = 2;
- else
- init_cwnd = (mss > 1095) ? 3 : 4;
+ else if (mss > 1460)
+ init_cwnd = 3;
if (*rcv_wnd > init_cwnd*mss)
*rcv_wnd = init_cwnd*mss;
}
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: TCP's initial cwnd setting correct?...
2007-08-08 5:01 ` David Miller
@ 2007-08-08 13:05 ` Ilpo Järvinen
0 siblings, 0 replies; 8+ messages in thread
From: Ilpo Järvinen @ 2007-08-08 13:05 UTC (permalink / raw)
To: David Miller; +Cc: Netdev
[-- Attachment #1: Type: TEXT/PLAIN, Size: 994 bytes --]
On Tue, 7 Aug 2007, David Miller wrote:
> From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
> Date: Mon, 6 Aug 2007 15:37:15 +0300 (EEST)
>
> > ...Another thing that makes me wonder, is the tp->mss_cache > 1460 check
> > as based on rfc3390 (and also it's precursor rfc2414) with up to 2190
> > bytes MSS TCP can use 3 as initial cwnd...
>
> I did the research and my memory was at least partially right.
>
> Below is an old bogus change of mine and the later revert with
> Alexey's explanation.
>
> This seems to be dealing with receive window calculation issues,
> rather than snd_cwnd. But they might be related and you should
> consider this very seriously.
...Thanks about the info, I'll study it carefully and see what's
relevant in here too. And anyway we're currently on the safer side
as the potential issues only make it more conservative than the RFC
allows, so there's no hurry to get it in until the solution is
acceptable if there's indeed a need for a change.
--
i.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-08-09 4:40 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-08-06 12:37 TCP's initial cwnd setting correct? Ilpo Järvinen
2007-08-06 18:31 ` Ilpo Järvinen
2007-08-08 15:26 ` John Heffner
2007-08-08 1:28 ` David Miller
2007-08-08 15:20 ` John Heffner
2007-08-09 4:40 ` David Miller
2007-08-08 5:01 ` David Miller
2007-08-08 13:05 ` Ilpo Järvinen
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.