From: Stephen Hemminger
Subject: Re: [PATCH] Add TCPCONG target to patch-o-matic
Date: Wed, 11 Oct 2006 13:04:13 -0700
Message-ID: <20061011130413.6d44063f@freekitty>
References: <45235EDC.4080709@aarnet.edu.au> <4524118B.4020903@netfilter.org> <4524BBB2.2000109@aarnet.edu.au> <452A66CA.70603@trash.net> <452B4512.4070705@aarnet.edu.au> <452C81E2.1040206@trash.net>
Cc: Mailinglist , Netfilter, Glen Turner
To: Patrick McHardy
In-Reply-To: <452C81E2.1040206@trash.net>
List-Id: netfilter-devel.vger.kernel.org

On Wed, 11 Oct 2006 07:32:18 +0200
Patrick McHardy wrote:

> Glen Turner wrote:
> > Patrick McHardy wrote:
> >
> >> I don't think iptables is the right place to do this. It should
> >> be controllable through routing IMO (which can already control
> >> some aspects of congestion control).
> >
> >
> > I too think the choice should usually be done through routing,
> > with the route holding the preferred congestion control algorithm
> > for traffic with that prefix. Whereas now the preferred algorithm
> > is read from a global parameter.
> >
> > But the algorithm for a particular connection should still be
> > able to be changed through iptables.
> >
> > Firstly, not every application can be easily altered to use
> > setsockopt() to select a differing algorithm from the default.
> > This is the argument used for the TCPMSS and similar targets.
>
> The difference is that TCPMSS changes packet data (also for
> forwarded packets) and doesn't fiddle around with sockets.
>
> > As the range of congestion control algorithms grows sysadmins
> > will want to choose differing algorithms for differing
> > applications.
> > For example, most algorithm's Ack strategies
> > interact poorly with transactional and remote procedure call
> > traffic, so choosing an algorithm which handles this well
> > could make, say, NFS over TCP traffic run a lot faster.

That's a problem with delayed ACKs, not the congestion control stuff.

> > Secondly, I want to make it easy for kernel and protocol
> > developers to run differing algorithms on differing port
> > numbers to test inter-algorithm fairness. Some TCP algorithms
> > are quite unfair -- unable even to share a link 50:50 between
> > two identical flows started a few seconds apart. A TCPCONG
> > target makes it much easier to do this testing. Now that the
> > kernel developers are getting good performance on long fat
> > pipes the fairness and other attributes of that performance
> > will be their next concern.
>
> It still strikes me as a bit of a hack. Lets see what Stephen
> thinks.
>

I don't like iptables interacting back with the socket state.
The only congestion control that makes sense to be application
specific is the TCP-LP stuff. The others are just duking it out,
to see which one should be the default.

I want to put the congestion choice in the route metrics as an option,
just haven't got around to doing it that way. The only advantage to
doing it in iptables is that you can make rules by port etc.

-- 
Stephen Hemminger
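
For reference, the per-application route discussed in the thread is the
Linux TCP_CONGESTION socket option, set with setsockopt(). The following
is only a rough sketch, assuming Linux and Python 3.6 or later. The helper
names decode_algorithm and set_congestion_control are made up for
illustration, and "reno" is used only because it is normally built into
the kernel.

```python
import socket

# TCP_CONGESTION is Linux-specific. Python 3.6+ exposes it as
# socket.TCP_CONGESTION; 13 is the Linux value, used as a fallback.
TCP_CONGESTION = getattr(socket, "TCP_CONGESTION", 13)


def decode_algorithm(raw: bytes) -> str:
    """Drop the NUL padding the kernel returns after the algorithm name."""
    return raw.split(b"\x00", 1)[0].decode("ascii")


def set_congestion_control(sock: socket.socket, name: str) -> str:
    """Request algorithm `name` on this socket and return what is now set.

    Hypothetical helper: raises OSError if the algorithm is not available
    or the caller is not permitted to select it.
    """
    sock.setsockopt(socket.IPPROTO_TCP, TCP_CONGESTION, name.encode("ascii"))
    # Algorithm names are limited to 16 bytes in the kernel (TCP_CA_NAME_MAX).
    raw = sock.getsockopt(socket.IPPROTO_TCP, TCP_CONGESTION, 16)
    return decode_algorithm(raw)


if __name__ == "__main__":
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            print(set_congestion_control(s, "reno"))
    except OSError as exc:
        print("could not select algorithm:", exc)
```

This only affects the one socket it is called on, which is why it does
not help with applications that cannot be modified.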