From: Florian Westphal <fw@strlen.de>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>, netdev@vger.kernel.org
Subject: Re: [PATCH next] tcp: use zero-window when free_space is low
Date: Mon, 16 Dec 2013 16:51:58 +0100
Message-ID: <20131216155158.GB3759@breakpoint.cc>
In-Reply-To: <1387201280.19078.223.camel@edumazet-glaptop2.roam.corp.google.com>
Eric Dumazet <eric.dumazet@gmail.com> wrote:
Hi Eric,
> On Mon, 2013-12-16 at 12:15 +0100, Florian Westphal wrote:
> > Currently the kernel tries to announce a zero window when free_space
> > is below the current receiver mss estimate.
> >
> > When a sender is transmitting small packets, the receiver might be
> > unable to shrink the receive window, because
> > a) we cannot withdraw already-committed receive window, and
> > b) we have to round the current rwin up to a multiple of the wscale factor,
> > else we would shrink the current window.
> >
> > This causes the receive buffer to fill up until the rmem limit is hit.
> > When this happens, we start dropping packets.
>
> I do not really understand the issue.
> Do you have a packetdrill test to demonstrate it ?
I am a moron and forgot to stress one crucial bit of information:
_slow_reader_ (or a reader that doesn't read from the socket at all!).
I am not very familiar with packetdrill, but it would look something like:
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.100...0.200 connect(3, ..., ...) = 0
0.100 > S 0:0(0) <mss 1460,sackOK,TS val 100 ecr 0,nop,wscale 7>
0.200 < S. 0:0(0) ack 1 win 32792 <mss 1460,sackOK,TS val 100 ecr 100,nop,wscale 7>
0.200 > . 1:1(0) ack 1 <nop,nop,TS val 100 ecr 100>
0.300 write(3, ..., 23) = 23
0.310 write(3, ..., 23) = 23
0.320 write(3, ..., 23) = 23
0.330 write(3, ..., 23) = 23
0.340 write(3, ..., 23) = 23
0.350 write(3, ..., 23) = 23
.. repeat indefinitely ..
Reproducer (non-packetdrill):
On server:
$ nc -l -p 12345
<suspend it: CTRL-Z>
Client:
#!/usr/bin/env python
import socket
import time

sock = socket.socket()
# Disable Nagle so each 23-byte write goes out as its own small segment.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
sock.connect(("192.168.4.1", 12345))
while True:
    sock.send(b'A' * 23)
    time.sleep(0.005)
The socket buffer on the server side will grow until tcp_rmem[2] is hit,
at which point the client retransmits data until -ETIMEDOUT.
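If suspending nc is inconvenient, the non-reading server can also be approximated with a few lines of Python. This is only a sketch and not what was used for the report above: the helper name start_idle_reader and its return values are made up for illustration. It accepts one connection and then deliberately never calls recv() on it.

```python
import socket
import threading


def start_idle_reader(host="127.0.0.1", port=0):
    """Accept one TCP connection and never read from it.

    Hypothetical helper for illustration. Returns (listener, bound_port,
    holder, thread); holder is a list that receives the accepted socket so
    the connection stays open for as long as the caller keeps it around.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    holder = []

    def accept_and_hold():
        conn, _peer = listener.accept()
        # Keep a reference, but deliberately never call conn.recv():
        # received data stays queued in the kernel receive buffer.
        holder.append(conn)

    thread = threading.Thread(target=accept_and_hold)
    thread.daemon = True
    thread.start()
    return listener, listener.getsockname()[1], holder, thread
```

To mirror the setup above, one would call start_idle_reader("0.0.0.0", 12345) on the server, keep the process alive, and then run the client script against it.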
Code flow on server side is:
tcp_data_queue -> tcp_try_rmem_schedule -> \
tcp_prune_queue -> tcp_clamp_window()
tcp_clamp_window() will then grow sk->sk_rcvbuf until it eventually
hits tcp_rmem[2].
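To make constraints a) and b) from the quoted patch description more concrete, here is a small arithmetic model. It is a simplified sketch, not the kernel code: the names align_up, offered_window and simulate are invented for this example, and the starting window of 32768 bytes is an arbitrary assumption. It only models "never offer less than what is still outstanding, rounded up to a multiple of 1 << wscale".

```python
def align_up(value, granularity):
    """Round value up to the next multiple of granularity (a power of two)."""
    return (value + granularity - 1) & ~(granularity - 1)


def offered_window(wanted, current, wscale):
    """Window (in bytes) the receiver ends up offering in this model.

    wanted:  window the receiver would like to offer, based on free space
    current: window still outstanding from the previous advertisement
    wscale:  window scale shift; the window granularity is 1 << wscale
    """
    if wanted < current:
        # a) the outstanding window cannot be withdrawn, and
        # b) it is rounded up to the wscale granularity.
        return align_up(current, 1 << wscale)
    return wanted


def simulate(segments, payload=23, wscale=7, start=32768):
    """Feed small segments to a receiver that would like to offer zero."""
    current = start
    history = []
    for _ in range(segments):
        current -= payload                     # segment uses up offered window
        current = offered_window(0, current, wscale)
        history.append(current)
    return history


print(simulate(3))  # [32768, 32768, 32768]
```

With wscale 7 the granularity is 128 bytes, so after each 23-byte segment the outstanding window is rounded straight back up to its previous value. In this model the receiver never gets to announce a smaller window, even though it would like to offer zero.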
Many thanks for looking into this Eric!