* tune back idle cwnd closing?
@ 2006-04-21 19:58 Zach Brown
2006-04-25 14:27 ` John Heffner
0 siblings, 1 reply; 11+ messages in thread
From: Zach Brown @ 2006-04-21 19:58 UTC (permalink / raw)
To: netdev
My apologies if this is a FAQ, I couldn't find it in the archives.
We have some dudes who are syncing large amounts of data across a
dedicated long fat pipe at somewhat irregular intervals that are, sadly,
longer than the RTO. They feel the pain of having to reopen the window
between transmissions.
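To put a number on that pain (the link figures below are hypothetical, just for scale): with slow start doubling cwnd once per round trip, reopening the window to a bandwidth-delay product's worth of segments takes about log2(BDP / initial cwnd) RTTs.

```python
import math

def slow_start_rtts(target_cwnd_segments, initial_cwnd=2):
    """RTTs for loss-free slow start to grow cwnd from initial_cwnd
    to the target, doubling once per round trip."""
    return math.ceil(math.log2(target_cwnd_segments / initial_cwnd))

# Hypothetical long fat pipe: 1 Gbit/s, 100 ms RTT, 1460-byte segments.
bdp_segments = int((10**9 / 8) * 0.100 / 1460)   # ~8561 segments
print(slow_start_rtts(bdp_segments))             # 13 RTTs, i.e. over a second at 100 ms
```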
Is there room for a compromise tunable that would be less aggressive
about closing cwnd during idle periods but which wouldn't violate the
spirit of 2861? No one wants broken TCP here.
They mention that Solaris has the tcp_slow_start_after_idle tunable and
that it helps their situation. I mention that only as a data point, I
wouldn't be foolish enough to try and use the presence of something in
Solaris as justification :)
- z
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: tune back idle cwnd closing?
2006-04-21 19:58 tune back idle cwnd closing? Zach Brown
@ 2006-04-25 14:27 ` John Heffner
2006-04-26 21:45 ` David S. Miller
0 siblings, 1 reply; 11+ messages in thread
From: John Heffner @ 2006-04-25 14:27 UTC (permalink / raw)
To: Zach Brown; +Cc: netdev
Zach Brown wrote:
> My apologies if this is a FAQ, I couldn't find it in the archives.
>
> We have some dudes who are syncing large amounts of data across a
> dedicated long fat pipe at somewhat irregular intervals that are, sadly,
> longer than the RTO. They feel the pain of having to reopen the window
> between transmissions.
>
> Is there room for a compromise tunable that would be less aggressive
> about closing cwnd during idle periods but which wouldn't violate the
> spirit of 2861? No one wants broken TCP here.
>
> They mention that Solaris has the tcp_slow_start_after_idle tunable and
> that it helps their situation. I mention that only as a data point, I
> wouldn't be foolish enough to try and use the presence of something in
> Solaris as justification :)
Yours is the first complaint of this kind I recall seeing, but I've
expected for a while someone would have this type of problem. RFC2861
seems conceptually nice at first, but there are a few things about it
that bother me. One thing in particular is that a naturally bursty
application (like yours) will actually perform better by padding its
connection with junk data whenever it doesn't have real data to send.
Or equivalently, it's punished for not sending data when it doesn't need
to. I also think it may not do much good when there are connections
with significantly different RTTs.
Given that RFC2681 is Experimental (and I'm not aware of any current
efforts in the IETF to push it to the standards track), IMHO it would not
be inappropriate to make this behavior controlled via sysctl.
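As it happens, Linux did later grow a knob of exactly this shape, net.ipv4.tcp_slow_start_after_idle (default 1, i.e. do the RFC 2861-style restart). On a kernel that provides it, a sketch of its use:

```shell
# Show the current setting (1 = slow start after idle, the default).
sysctl net.ipv4.tcp_slow_start_after_idle

# Disable cwnd decay after idle; use with care on shared paths.
sysctl -w net.ipv4.tcp_slow_start_after_idle=0
```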
Thanks,
-John
* Re: tune back idle cwnd closing?
2006-04-25 14:27 ` John Heffner
@ 2006-04-26 21:45 ` David S. Miller
2006-04-26 22:16 ` Rick Jones
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: David S. Miller @ 2006-04-26 21:45 UTC (permalink / raw)
To: jheffner; +Cc: zach.brown, netdev
From: John Heffner <jheffner@psc.edu>
Date: Tue, 25 Apr 2006 10:27:37 -0400
> Yours is the first complaint of this kind I recall seeing, but I've
> expected for a while someone would have this type of problem. RFC2861
> seems conceptually nice at first, but there are a few things about it
> that bother me. One thing in particular is that a naturally bursty
> application (like yours) will actually perform better by padding its
> connection with junk data whenever it doesn't have real data to send.
> Or equivalently, it's punished for not sending data when it doesn't need
> to. I also think it may not do much good when there are connections
> with significantly different RTTs.
>
> Given that RFC2681 is Experimental (and I'm not aware of any current
> efforts in the IETF to push it to the standards track), IMHO it would not
> be inappropriate to make this behavior controlled via sysctl.
I have to respectfully disagree.
This is the price you pay when the network's congestion is being
measured by probing: information becomes stale over time if you don't
send any probes.
And this change of congestion state is real and happens frequently for
most end-to-end users.
When your bursty application is not sending, other flows can take up
the pipe space you are not using, and you must reprobe to figure that
out.
* Re: tune back idle cwnd closing?
2006-04-26 21:45 ` David S. Miller
@ 2006-04-26 22:16 ` Rick Jones
2006-04-26 22:27 ` Stephen Hemminger
2006-04-26 22:33 ` David S. Miller
2006-04-26 23:25 ` Zach Brown
2006-04-27 17:47 ` John Heffner
2 siblings, 2 replies; 11+ messages in thread
From: Rick Jones @ 2006-04-26 22:16 UTC (permalink / raw)
To: David S. Miller; +Cc: jheffner, zach.brown, netdev
> When your bursty application is not sending, other flows can take up
> the pipe space you are not using, and you must reprobe to figure that
> out.
If the "restarted" connection does normal slow-start, one of two things
will happen, yes? Either it will grow its cwnd to >= the receiver's
window, or it will have to stop before then because it triggered a
packet loss.
In the first case, seems it would have been just as good to let the
connection burst.
In the second case, is the effect on other connections really any better
than if the connection just started-up from where it was before?
BTW, is the RFC 2681? I looked that one up on ietf.org and the RFC by
that number was a different beast entirely - at least at a very quick
glance.
rick jones
* Re: tune back idle cwnd closing?
2006-04-26 22:16 ` Rick Jones
@ 2006-04-26 22:27 ` Stephen Hemminger
2006-04-26 22:44 ` Rick Jones
2006-04-26 22:33 ` David S. Miller
1 sibling, 1 reply; 11+ messages in thread
From: Stephen Hemminger @ 2006-04-26 22:27 UTC (permalink / raw)
To: Rick Jones; +Cc: David S. Miller, jheffner, zach.brown, netdev
On Wed, 26 Apr 2006 15:16:18 -0700
Rick Jones <rick.jones2@hp.com> wrote:
> > When your bursty application is not sending, other flows can take up
> > the pipe space you are not using, and you must reprobe to figure that
> > out.
>
> If the "restarted" connection does normal slow-start, one of two things
> will happen, yes? Either it will grow its cwnd to >= the receiver's
> window, or it will have to stop before then because it triggered a
> packet loss.
>
> In the first case, seems it would have been just as good to let the
> connection burst.
>
> In the second case, is the effect on other connections really any better
> than if the connection just started-up from where it was before?
>
> BTW, is the RFC 2681? I looked that one up on ietf.org and the RFC by
> that number was a different beast entirely - at least at a very quick
> glance.
>
> rick jones
http://www.faqs.org/rfcs/rfc2861.html
Long periods when the sender is application-limited can lead to the
invalidation of the congestion window. During periods when the TCP
sender is network-limited, the value of the congestion window is
repeatedly "revalidated" by the successful transmission of a window
of data without loss. When the TCP sender is network-limited, there
is an incoming stream of acknowledgements that "clocks out" new data,
giving concrete evidence of recent available bandwidth in the
network. In contrast, during periods when the TCP sender is
application-limited, the estimate of available capacity represented
by the congestion window may become steadily less accurate over time.
In particular, capacity that had once been used by the network-
limited connection might now be used by other traffic.
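In practice the decay is gentler than a full reset. A rough sketch of the Linux-style restart logic, which halves cwnd once per RTO of idle time down to a restart window (an illustration, not the kernel source; the floor value here is an assumption):

```python
def cwnd_after_idle(cwnd, idle_rtos, restart_cwnd=4):
    """Halve cwnd once per full RTO spent idle, never going below
    the restart window (restart_cwnd is an assumed floor)."""
    for _ in range(idle_rtos):
        if cwnd <= restart_cwnd:
            break
        cwnd = max(cwnd // 2, restart_cwnd)
    return cwnd

print(cwnd_after_idle(100, idle_rtos=1))   # 50: one RTO of idle halves the window
print(cwnd_after_idle(100, idle_rtos=10))  # 4: a long idle collapses it to the floor
```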
* Re: tune back idle cwnd closing?
2006-04-26 22:16 ` Rick Jones
2006-04-26 22:27 ` Stephen Hemminger
@ 2006-04-26 22:33 ` David S. Miller
1 sibling, 0 replies; 11+ messages in thread
From: David S. Miller @ 2006-04-26 22:33 UTC (permalink / raw)
To: rick.jones2; +Cc: jheffner, zach.brown, netdev
From: Rick Jones <rick.jones2@hp.com>
Date: Wed, 26 Apr 2006 15:16:18 -0700
> BTW, is the RFC 2681? I looked that one up on ietf.org and the RFC by
> that number was a different beast entirely - at least at a very quick
> glance.
Congestion window validation is the correct RFC.
* Re: tune back idle cwnd closing?
2006-04-26 22:27 ` Stephen Hemminger
@ 2006-04-26 22:44 ` Rick Jones
0 siblings, 0 replies; 11+ messages in thread
From: Rick Jones @ 2006-04-26 22:44 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, jheffner, zach.brown, netdev
>>BTW, is the RFC 2681? I looked that one up on ietf.org and the RFC by
>>that number was a different beast entirely - at least at a very quick
>>glance.
>>
>>rick jones
>
>
> http://www.faqs.org/rfcs/rfc2861.html
thanks.
> Long periods when the sender is application-limited can lead to the
> invalidation of the congestion window. During periods when the TCP
> sender is network-limited, the value of the congestion window is
> repeatedly "revalidated" by the successful transmission of a window
> of data without loss. When the TCP sender is network-limited, there
> is an incoming stream of acknowledgements that "clocks out" new data,
> giving concrete evidence of recent available bandwidth in the
> network. In contrast, during periods when the TCP sender is
> application-limited, the estimate of available capacity represented
> by the congestion window may become steadily less accurate over time.
> In particular, capacity that had once been used by the network-
> limited connection might now be used by other traffic.
May, might, could... :)
What concerned me the most was section 5, where the experiments were for
dial-up connections and an interactive user then cat'ing a large file to
the screen. How often does someone "list a moderately large file"
without using less or more? And the bit about the second experiment
with the real modem bank not showing any difference in what the user
experienced because the bank had buffering was interesting. It suggests
(to me anyway) that perhaps the TCP receive window was too large for a
modem connection in the first place. Leaves me wondering what effect
Linux's moderated receive window would have on that experiment.
rick jones
* Re: tune back idle cwnd closing?
2006-04-26 21:45 ` David S. Miller
2006-04-26 22:16 ` Rick Jones
@ 2006-04-26 23:25 ` Zach Brown
2006-04-27 17:47 ` John Heffner
2 siblings, 0 replies; 11+ messages in thread
From: Zach Brown @ 2006-04-26 23:25 UTC (permalink / raw)
To: David S. Miller; +Cc: jheffner, netdev
>> Given that RFC2681 is Experimental (and I'm not aware of any current
>> efforts in the IETF to push it to the standards track), IMHO it would not
>> be inappropriate to make this behavior controlled via sysctl.
>
> I have to respectfully disagree.
OK, thanks for taking the time to look at it.
- z
* Re: tune back idle cwnd closing?
2006-04-26 21:45 ` David S. Miller
2006-04-26 22:16 ` Rick Jones
2006-04-26 23:25 ` Zach Brown
@ 2006-04-27 17:47 ` John Heffner
2006-04-27 20:19 ` David S. Miller
2 siblings, 1 reply; 11+ messages in thread
From: John Heffner @ 2006-04-27 17:47 UTC (permalink / raw)
To: David S. Miller; +Cc: zach.brown, netdev
David S. Miller wrote:
> From: John Heffner <jheffner@psc.edu>
>> Given that RFC2681 is Experimental (and I'm not aware of any current
>> efforts in the IETF to push it to the standards track), IMHO it would not
>> be inappropriate to make this behavior controlled via sysctl.
>
> I have to respectfully disagree.
>
> This is the price you pay when the network's congestion is being
> measured by probing: information becomes stale over time if you don't
> send any probes.
>
> And this change of congestion state is real and happens frequently for
> most end-to-end users.
>
> When your bursty application is not sending, other flows can take up
> the pipe space you are not using, and you must reprobe to figure that
> out.
A lot of the time doing 2861 is a good thing, since if you have a long
pause, you've lost your ack clock, and you don't want to send a
window-sized burst because you'll probably overflow a queue somewhere
and step on your own feet. Since we don't have a pacing mechanism, a
slow start is really the only way to do this.
I don't entirely buy the "staleness" argument. I don't think that *not*
doing 2861 will affect the stability of congestion control, since all of
the response mechanisms are still in place. (Most OSes don't do 2861,
and it is not a standard.) If you have a long RTT, short RTT flows can
make a big difference in congestion in a period much smaller than your
timeout. In fact, congestion information is *always* stale by the time
you get it. :)
Sometimes having cwnd validation turned on will make your applications
perform better, sometimes worse. I don't think it would be incorrect to
add a switch. One question is whether it's worth adding the switch
(i.e., do enough people care?).
Myself, I'd be interested to see some quantitative comparisons of
performance with a "real" application affected by this.
Thanks,
-John
* Re: tune back idle cwnd closing?
2006-04-27 17:47 ` John Heffner
@ 2006-04-27 20:19 ` David S. Miller
2006-04-27 21:12 ` Rick Jones
0 siblings, 1 reply; 11+ messages in thread
From: David S. Miller @ 2006-04-27 20:19 UTC (permalink / raw)
To: jheffner; +Cc: zach.brown, netdev
From: John Heffner <jheffner@psc.edu>
Date: Thu, 27 Apr 2006 13:47:33 -0400
> (Most OSes don't do 2861, and it is not a standard.)
Are you so sure? Doing cwnd timeout largely predates the congestion
window validation work, in fact by several years.
RFC 2581 mentions Van Jacobson's recommendation of this idle-period
behavior, as just one example.
Your arguments about "all the feedback mechanisms are in place, so not
reducing the cwnd after idle doesn't hurt congestion control" could be
applied to the packet retransmit timeout handling of the congestion
window, and I think that's kind of silly.
* Re: tune back idle cwnd closing?
2006-04-27 20:19 ` David S. Miller
@ 2006-04-27 21:12 ` Rick Jones
0 siblings, 0 replies; 11+ messages in thread
From: Rick Jones @ 2006-04-27 21:12 UTC (permalink / raw)
To: David S. Miller; +Cc: jheffner, zach.brown, netdev
having looked now at both 2861 and the '99 paper it references, I see lots
of "mays", "mights", and "beliefs", but nothing "real world."
the CWV vs non-CWV comparison was done against a TCP that did indeed reset
cwnd after an RTT of idle, so it wasn't showing reset-at-idle versus no
reset at idle, just CWV's less draconian (?) reset than the non-CWV stack's.
the experimental validation in the '99 paper was still a simulation using
dummynet, with a number of buffers rather smaller than what modem banks
were offering at the time, and it was for a modem rather than any other
sort of link. and when they did use a real modem, the buffering in the
modem bank seems to have made the whole thing moot.
there was nothing about the effect on intranets, or high-speed long hauls,
or any of that.
what that means wrt having a sysctl to enable/disable functionality
still listed as experimental, i'm not sure.
rick jones
end of thread, other threads:[~2006-04-27 21:12 UTC | newest]
Thread overview: 11+ messages
2006-04-21 19:58 tune back idle cwnd closing? Zach Brown
2006-04-25 14:27 ` John Heffner
2006-04-26 21:45 ` David S. Miller
2006-04-26 22:16 ` Rick Jones
2006-04-26 22:27 ` Stephen Hemminger
2006-04-26 22:44 ` Rick Jones
2006-04-26 22:33 ` David S. Miller
2006-04-26 23:25 ` Zach Brown
2006-04-27 17:47 ` John Heffner
2006-04-27 20:19 ` David S. Miller
2006-04-27 21:12 ` Rick Jones