From: Willy Tarreau <w@1wt.eu>
To: Neal Cardwell <ncardwell@google.com>
Cc: Lutz Vieweg <lvml@5t9.de>, David Miller <davem@davemloft.net>,
Soheil Hassas Yeganeh <soheil.kdev@gmail.com>,
Netdev <netdev@vger.kernel.org>,
Soheil Hassas Yeganeh <soheil@google.com>,
Eric Dumazet <edumazet@google.com>,
Yuchung Cheng <ycheng@google.com>,
Florian Westphal <fw@strlen.de>
Subject: Re: [PATCH net-next 1/2] tcp: remove per-destination timestamp cache
Date: Thu, 16 Mar 2017 17:05:44 +0100
Message-ID: <20170316160544.GD15641@1wt.eu>
In-Reply-To: <CADVnQy=Bxs-e-KxQsA8nPSfm9fVtogouxCRdgmrR0WyBkMS=Xw@mail.gmail.com>

Hi Neal,
On Thu, Mar 16, 2017 at 11:40:52AM -0400, Neal Cardwell wrote:
> On Thu, Mar 16, 2017 at 7:31 AM, Lutz Vieweg <lvml@5t9.de> wrote:
> >
> > On 03/15/2017 11:55 PM, Willy Tarreau wrote:
> >>
> >> At least I can say I've seen many people enable it without understanding its impact, confusing it
> >> with tcp_tw_reuse, and copy-pasting it from random blogs and complaining about issues in
> >> production.
> >
> >
> > I currently wonder: What is the correct advice to an operator who needs
> > to run one server instance that is meant to accept thousands of new,
> > short-lived TCP connections per minute?
>
> Note that for this to be a problem there would have to be thousands of
> new, short-lived TCP connections per minute from a single source IP
> address to a single destination IP address. Normal client software
> should not be doing this. AFAIK this is pretty rare, unless someone is
> running a load test or has an overly-aggressive monitoring system. NAT
> boxes or proxies with that kind of traffic should be running with
> multiple public source IPs.
In fact this is routine with reverse proxies: you can cycle through the
whole source port range every second. But once timestamps are enabled, you
benefit from PAWS and the problem goes away; everything works pretty
well.
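To put rough numbers on this, here is a back-of-the-envelope sketch. It
assumes the common Linux default ephemeral port range (32768-60999), a
60-second TIME_WAIT, and that a source port cannot be reused towards the
same destination IP and port while its previous connection sits in
TIME_WAIT. The function name is made up for illustration.

```python
def max_sustained_rate(port_lo: int, port_hi: int, time_wait_s: float) -> float:
    """Upper bound on new connections per second towards one destination
    (IP, port) from one source IP, if every closed connection blocks its
    source port for time_wait_s seconds."""
    ports = port_hi - port_lo + 1
    return ports / time_wait_s


# Assumed values: default net.ipv4.ip_local_port_range and a 60 s TIME_WAIT.
PORT_LO, PORT_HI, TIME_WAIT_S = 32768, 60999, 60

ports = PORT_HI - PORT_LO + 1
rate = max_sustained_rate(PORT_LO, PORT_HI, TIME_WAIT_S)
print(f"{ports} ports, at most {rate:.0f} new connections/s per destination")
```

Under these assumptions the ceiling is a few hundred connections per
second, far below one pass over the port range per second, which is why
port reuse during TIME_WAIT matters so much for a proxy.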
> But if/when the problem occurs, then the feasible solutions I'm aware
> of, in approximate descending order of preference, are:
>
> (1) use longer connections from the client side (browsers and RPC libraries are
> usually pretty good about keeping connections open for a long time, so this
> is usually sufficient)
>
> (2) have the client do the close(), so the client is the side to carry the
> TIME_WAIT state
That's impossible for proxies, as you can't connect again from the same
source port while it sits in TIME_WAIT, which divides performance by more
than 100. What proxies have to do when they're forced to close an outgoing
connection first is to set SO_LINGER with a zero timeout so that an RST is
sent and the source port can be reused. But as you can guess, if that RST
gets lost, the next connection attempt is not pretty: either
[SYN, ACK, RST, pause, SYN, SYN-ACK, ACK] or
[SYN, RST, pause, SYN, SYN-ACK, ACK], depending on whether the SYN falls
within the previous window or not.
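The abortive close described above can be sketched as follows, in Python
for brevity. This is only an illustration on loopback, not what any
particular proxy does. It assumes a platform where struct linger is two C
ints (true on Linux); note that the option has to be switched on
(l_onoff=1) with l_linger=0 for close() to send an RST. The names
close_with_rst and demo are hypothetical.

```python
import socket
import struct


def close_with_rst(sock: socket.socket) -> None:
    """Abortive close: SO_LINGER enabled with a zero timeout makes close()
    send an RST instead of a FIN, so this side keeps no TIME_WAIT state."""
    linger = struct.pack("ii", 1, 0)  # l_onoff=1, l_linger=0
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
    sock.close()


def demo() -> str:
    """Connect over loopback, abort from the client side, and report what
    the accepting side observes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(2.0)
    close_with_rst(client)
    try:
        data = server_side.recv(1)
        return "fin" if data == b"" else "data"
    except ConnectionResetError:
        return "rst"
    finally:
        server_side.close()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

On loopback the RST cannot get lost, so this does not show the ugly retry
sequences listed above; it only shows that the peer sees a reset rather
than an orderly shutdown.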
> (3) have the server use SO_LINGER with a timeout of 0, so that
> the connection is closed with a RST and the server carries no
> TIME_WAIT state
The problem is that it also kills the tail data: anything still queued for
sending when the RST goes out is discarded.
Quite frankly, the only issues I'm used to seeing are with clients closing
first and with reusing source ports. As long as timestamps are enabled on
both sides and people don't blindly play with tcp_tw_recycle, I really
never face any connection issue.
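For reference, the timestamp comparison behind PAWS can be sketched as
below. This is a simplified reading of the RFC 7323 rule (a segment whose
TSval is older than the last one recorded for the connection is rejected),
using 32-bit wraparound arithmetic. It is not the kernel's code, it leaves
out the idle-connection and RST exceptions, and the function names are
made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def ts_before(a: int, b: int) -> bool:
    """True if 32-bit timestamp a is older than b, modulo 2**32
    (i.e. the signed 32-bit difference a - b is negative)."""
    return ((a - b) & MASK32) >= 0x80000000


def paws_accepts(seg_tsval: int, ts_recent: int) -> bool:
    """Simplified PAWS test: accept a segment unless its timestamp is
    older than the most recent one seen from that peer."""
    return not ts_before(seg_tsval, ts_recent)


if __name__ == "__main__":
    print(paws_accepts(100, 50))         # newer timestamp
    print(paws_accepts(50, 100))         # stale timestamp
    print(paws_accepts(5, 0xFFFFFFF0))   # newer, after 32-bit wraparound
```

The same kind of comparison is what lets a receiver tell a fresh SYN on a
reused source port apart from an old duplicate of the previous connection.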
Willy