Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT

From: Cong Wang <amwang@redhat.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Patrick McHardy <kaber@trash.net>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specifiy the time of TIME-WAIT
Date: Fri, 28 Sep 2012 14:33:07 +0800
Message-ID: <1348813987.7264.41.camel@cr0>
In-Reply-To: <20120927142334.GA3194@neilslaptop.think-freely.org>

On Thu, 2012-09-27 at 10:23 -0400, Neil Horman wrote:
> On Thu, Sep 27, 2012 at 04:41:01PM +0800, Cong Wang wrote:
> > A customer requested this feature, as they stated:
> > 
> > 	"This parameter is necessary, especially for software that continually 
> >         creates many ephemeral processes which open sockets, to avoid socket 
> >         exhaustion. In many cases, the risk of the exhaustion can be reduced by 
> >         tuning reuse interval to allow sockets to be reusable earlier.
> > 
> >         In commercial Unix systems, this kind of parameters, such as 
> >         tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have 
> >         already been available. Their implementations allow users to tune 
> >         how long they keep TCP connection as TIME-WAIT state on the 
> >         millisecond time scale."
> > 
> > We do have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunings
> > are not equivalent, in that they cannot be tuned directly on a time
> > scale, nor in a safe way, as some combinations of them can still
> > cause problems with NAT. Also, I think a granularity of seconds is
> > enough; we don't have to support a millisecond time scale.
> > 
> I think I have a little difficulty seeing how this does anything other than
> pay lip service to actually having sockets spend time in TIME_WAIT state.  That
> is to say, I can see users using this just to make the pain stop.  If we wait
> less time than it takes to be sure that a connection isn't being reused (either
> by waiting two segment lifetimes, or by checking timestamps), then you might as
> well not wait at all.  I see how it's tempting to be able to say "Just don't wait
> as long", but it seems that there's no difference between waiting half as long as
> the RFC mandates, and waiting no time at all.  Neither is a good idea.

I don't think reducing TIME_WAIT is a good idea either, but there must
be some reason behind it, since several UNIX systems provide a
millisecond-scale tuning interface. Or maybe, in non-recycle mode, their
RTO is much less than 2*MSL?
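
To put a rough number on the exhaustion concern, here is a
back-of-the-envelope sketch in Python. It assumes that every connection
goes to the same remote address and port, that each closed connection
holds its local port for the whole TIME_WAIT period, and that the
ephemeral port range is 32768-61000 (inclusive) with a 60 second
TIME_WAIT. The function name is made up for illustration, and the
numbers should be checked against the system in question.

```python
def max_sustained_connect_rate(port_low, port_high, time_wait_seconds):
    """Rough upper bound on new connections per second to ONE remote
    address/port pair, if every closed connection keeps its local port
    busy for the full TIME_WAIT period (an assumption of this sketch)."""
    if port_high < port_low:
        raise ValueError("port_high must be >= port_low")
    if time_wait_seconds <= 0:
        raise ValueError("time_wait_seconds must be positive")
    usable_ports = port_high - port_low + 1  # range treated as inclusive
    return usable_ports / time_wait_seconds


if __name__ == "__main__":
    # Assumed values: port range 32768-61000, TIME_WAIT of 60 seconds,
    # then two shorter, purely hypothetical intervals for comparison.
    for tw in (60, 30, 10):
        rate = max_sustained_connect_rate(32768, 61000, tw)
        print(f"TIME_WAIT={tw:>2}s -> about {rate:.0f} connections/s")
```

Under these assumptions the bound scales inversely with the interval, so
halving TIME_WAIT only doubles the sustainable rate.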

> 
> Given the problem you're trying to solve here, I'll ask the standard question in
> response: How does using SO_REUSEADDR not solve the problem?  Alternatively, in
> a pinch, why not reduce the tcp_max_tw_buckets sufficiently to start forcing
> TIME_WAIT sockets back into CLOSED state?
> 
> The code looks fine, but the idea really doesn't seem like a good plan to me.
> I'm sure HPUX/Solaris/AIX/etc have done this in response to customer demand, but
> that doesn't make it the right solution.
> 

*I think* the customer doesn't want to modify their applications, so
that is why they don't use SO_REUSEADDR.
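
For reference, the application-side change being discussed is small. A
minimal sketch in Python, assuming a listening TCP socket; the helper
name and the default address are hypothetical, and whether this covers
the customer's actual workload is not something this sketch can show:

```python
import socket


def make_reusable_listener(host="127.0.0.1", port=0, backlog=16):
    """Create a listening TCP socket with SO_REUSEADDR set before bind().

    The option has to be set before bind() to have any effect on it.
    port=0 asks the kernel to pick a free port (handy for a demo).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(backlog)
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    listener = make_reusable_listener()
    print("listening on", listener.getsockname())
    listener.close()
```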

I didn't know tcp_max_tw_buckets could do the trick, and neither did the
customer. So is this a side effect of tcp_max_tw_buckets? Is it documented?
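
For anyone who wants to look at the current limit, here is a small
sketch in Python that reads the sysctl through procfs. The path is where
net.ipv4.tcp_max_tw_buckets normally appears on Linux; the helper name is
made up, and the sketch only reads the value, it says nothing about what
happens once the limit is reached.

```python
from pathlib import Path

# Usual procfs location of net.ipv4.tcp_max_tw_buckets on Linux.
TW_BUCKETS_PATH = "/proc/sys/net/ipv4/tcp_max_tw_buckets"


def read_int_sysctl(path=TW_BUCKETS_PATH):
    """Return the integer stored in a procfs sysctl file, or None if the
    file is missing, unreadable, or does not contain a single integer."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_int_sysctl()
    if value is None:
        print("tcp_max_tw_buckets is not readable on this system")
    else:
        print("tcp_max_tw_buckets =", value)
```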

Thanks.
