From: Florian Westphal <fw@strlen.de>
To: Michal Kubecek <mkubecek@suse.cz>,
netdev@vger.kernel.org, Florian Westphal <fw@strlen.de>,
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Subject: Re: tcp hang when socket fills up ?
Date: Tue, 17 Apr 2018 15:29:41 +0200
Message-ID: <20180417132941.cutzhgbrveatrdsp@breakpoint.cc>
In-Reply-To: <20180417123437.GA19885@nautica>
Dominique Martinet <asmadeus@codewreck.org> wrote:
[ CC Jozsef ]
> Could it have something to do with the way I setup the connection?
> I don't think the "both remotes call connect() with carefully selected
> source/dest port" is a very common case..
>
> If you look at the tcpdump outputs I attached the sequence usually is
> something like
> server > client SYN
> client > server SYN
> server > client SYNACK
> client > server ACK
>
> ultimately it IS a connection, but with an extra SYN packet in front of
> it (that first SYN opens up the conntrack of the nat so that the
> client's syn can come in, the client's conntrack will be that of a
> normal connection since its first SYN goes in directly after the
> server's (it didn't see the server's SYN))
>
> Looking at my logs again, I'm seeing the same as you:
>
> This looks like the actual SYN/SYN/SYNACK/ACK:
> - 14.364090 seq=505004283 likely SYN coming out of server
> - 14.661731 seq=1913287797 on next line it says receiver
> end=505004284 so likely the matching SYN from client
> Which this time gets a proper SYNACK from server:
> 14.662020 seq=505004283 ack=1913287798
> And following final dataless ACK:
> 14.687570 seq=1913287798 ack=505004284
>
> Then as you point out some data ACK, where the scale poofs:
> 14.688762 seq=1913287798 ack=505004284+(0) sack=505004284+(0) win=229 end=1913287819
> 14.688793 tcp_in_window: sender end=1913287798 maxend=1913316998 maxwin=29312 scale=7 receiver end=505004284 maxend=505033596 maxwin=29200 scale=7
> 14.688824 tcp_in_window:
> 14.688852 seq=1913287798 ack=505004284+(0) sack=505004284+(0) win=229 end=1913287819
> 14.688882 tcp_in_window: sender end=1913287819 maxend=1913287819 maxwin=229 scale=0 receiver end=505004284 maxend=505033596 maxwin=29200 scale=7
>
> As you say, only tcp_options() will clear only one side of the scales.
> We don't have sender->td_maxwin == 0 (printed) so I see no other way
> than we are in the last else if:
> - we have after(end, sender->td_end) (end=1913287819 > sender
> end=1913287798)
> - I assume the tcp state machine must be confused because of the
> SYN/SYN/SYNACK/ACK pattern and we probably enter the next check,
> but since this is a data packet it doesn't have the tcp option for scale
> thus scale resets.
Yes, this looks correct. Jozsef, can you please have a look?
The problem seems to be that conntrack believes the ACK packet
re-initializes the connection:
	/*
	 * RFC 793: "if a TCP is reinitialized ... then it need
	 * not wait at all; it must only be sure to use sequence
	 * numbers larger than those recently used."
	 */
	sender->td_end =
	sender->td_maxend = end;
	sender->td_maxwin = (win == 0 ? 1 : win);

	tcp_options(skb, dataoff, tcph, sender);
and the last line clears the scale value (a data packet carries no wscale option).
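To illustrate why re-parsing options on a data packet loses the scale, here is a minimal sketch of an option parser that, like the tcp_options() call above, forgets the previous scale before walking the options. It is a simplified model, not the kernel source: struct peer_state and parse_options() are hypothetical names, and only the NOP, EOL and window scale option kinds are handled specially.

```c
#include <stddef.h>
#include <stdint.h>

/* TCP option kinds (RFC 793 / RFC 7323). */
enum { OPT_EOL = 0, OPT_NOP = 1, OPT_WSCALE = 3 };

/* Hypothetical, reduced stand-in for the per-direction conntrack state. */
struct peer_state {
	uint8_t scale;      /* window scale shift, 0 if none was seen */
	int     has_wscale; /* non-zero once a wscale option was parsed */
};

/*
 * Simplified model: reset the scale first, then set it again only if
 * this segment carries a window scale option. opts/len describe the
 * raw TCP option bytes of one segment.
 */
static void parse_options(const uint8_t *opts, size_t len,
			  struct peer_state *st)
{
	size_t i = 0;

	st->scale = 0;
	st->has_wscale = 0;

	while (i < len) {
		uint8_t kind = opts[i];

		if (kind == OPT_EOL)
			return;
		if (kind == OPT_NOP) {
			i++;
			continue;
		}
		/* every other option carries a length byte */
		if (i + 1 >= len)
			return;
		uint8_t optlen = opts[i + 1];
		if (optlen < 2 || i + optlen > len)
			return;
		if (kind == OPT_WSCALE && optlen == 3) {
			/* RFC 7323 limits the shift count to 14 */
			st->scale = opts[i + 2] > 14 ? 14 : opts[i + 2];
			st->has_wscale = 1;
		}
		i += optlen;
	}
}
```

Feeding it the options of a SYN (with wscale) and then those of a plain data segment (timestamps only) leaves scale at 0, which is the effect visible in the debug output.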
Transitions are:
server > client   SYN      sNO -> sSS
client > server   SYN      sSS -> sS2
server > client   SYNACK   sS2 -> sSR /* here */
client > server   ACK      sSR -> sES
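The four transitions above can be written down as a small lookup, purely as a sketch. It encodes only what is listed in this mail, with server > client taken as the original direction (the server's SYN creates the entry). It is not the full conntrack state table; struct transition, observed[] and next_state() are hypothetical names, and sIV is used here simply as the fallback for anything not listed.

```c
#include <stddef.h>
#include <string.h>

/* Short state names as used in the transition list. */
enum ct_state { sNO, sSS, sS2, sSR, sES, sIV };
enum ct_dir { DIR_ORIGINAL, DIR_REPLY };

struct transition {
	enum ct_state from;
	enum ct_dir   dir;
	const char   *pkt;   /* "SYN", "SYNACK" or "ACK" */
	enum ct_state to;
};

/* Only the four transitions listed in this mail. */
static const struct transition observed[] = {
	{ sNO, DIR_ORIGINAL, "SYN",    sSS },
	{ sSS, DIR_REPLY,    "SYN",    sS2 },
	{ sS2, DIR_ORIGINAL, "SYNACK", sSR },
	{ sSR, DIR_REPLY,    "ACK",    sES },
};

/* Look up the next state; unknown combinations map to sIV in this sketch. */
static enum ct_state next_state(enum ct_state cur, enum ct_dir dir,
				const char *pkt)
{
	for (size_t i = 0; i < sizeof(observed) / sizeof(observed[0]); i++)
		if (observed[i].from == cur && observed[i].dir == dir &&
		    strcmp(observed[i].pkt, pkt) == 0)
			return observed[i].to;
	return sIV;
}
```

Walking SYN, SYN, SYNACK, ACK through next_state() starting at sNO ends in sES, with the SYNACK arriving in the original direction.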
The SYN/ACK was observed in the original direction, so when we see the
ACK packet we hit the
	state->state == TCP_CONNTRACK_SYN_RECV && dir == IP_CT_DIR_REPLY
test and end up in the 'TCP is reinitialized' branch.
AFAICS, without this, connection would move to sES just fine,
as the data ack is in window.
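As a sanity check on the numbers in the debug output, the following sketch applies the same assignments as the 'TCP is reinitialized' branch to the sender state from the log. It is a model under stated assumptions: struct peer_window, reinit_sender() and scaled_window() are hypothetical names, and the parsed scale is passed in explicitly rather than read from a packet.

```c
#include <stdint.h>

/* Hypothetical, reduced per-direction window state. */
struct peer_window {
	uint32_t td_end;
	uint32_t td_maxend;
	uint32_t td_maxwin;
	uint8_t  td_scale;
};

/*
 * Same assignments as the reinitialization branch; parsed_scale is
 * whatever the option parser found in this packet (0 if no wscale).
 */
static void reinit_sender(struct peer_window *s, uint32_t end,
			  uint32_t win, uint8_t parsed_scale)
{
	s->td_end = s->td_maxend = end;
	s->td_maxwin = (win == 0 ? 1 : win);
	s->td_scale = parsed_scale;
}

/* Window the peer actually advertised once the scale is applied. */
static uint32_t scaled_window(uint32_t raw_win, uint8_t scale)
{
	return raw_win << scale;
}
```

With the logged values (end=1913287819, win=229, no wscale option) the sender ends up with maxend=1913287819, maxwin=229 and scale=0, whereas the same raw window with scale=7 is 229 << 7 = 29312, the maxwin seen before the reset.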