From: Eric Dumazet <eric.dumazet@gmail.com>
To: Kieran Mansley <kmansley@solarflare.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>, netdev@vger.kernel.org
Subject: Re: TCPBacklogDrops during aggressive bursts of traffic
Date: Tue, 22 May 2012 18:45:35 +0200
Message-ID: <1337705135.3361.226.camel@edumazet-glaptop>
In-Reply-To: <1337704382.1698.53.camel@kjm-desktop.uk.level5networks.com>
On Tue, 2012-05-22 at 17:32 +0100, Kieran Mansley wrote:
> On Tue, 2012-05-22 at 18:12 +0200, Eric Dumazet wrote:
> >
> > __tcp_select_window() (more precisely tcp_space()) takes into account
> > memory used in the receive/ofo queues, but not frames in the backlog queue.
> >
> > So if you send bursts, it might explain why the TCP stack continues to
> > advertise too big a window, instead of anticipating the problem.
> >
> > Please try the following patch :
> >
> > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > index e79aa48..82382cb 100644
> > --- a/include/net/tcp.h
> > +++ b/include/net/tcp.h
> > @@ -1042,8 +1042,9 @@ static inline int tcp_win_from_space(int space)
> > /* Note: caller must be prepared to deal with negative returns */
> > static inline int tcp_space(const struct sock *sk)
> > {
> > - return tcp_win_from_space(sk->sk_rcvbuf -
> > - atomic_read(&sk->sk_rmem_alloc));
> > + int used = atomic_read(&sk->sk_rmem_alloc) + sk->sk_backlog.len;
> > +
> > + return tcp_win_from_space(sk->sk_rcvbuf - used);
> > }
> >
> > static inline int tcp_full_space(const struct sock *sk)
>
>
> I can give this a try (not sure when - probably later this week) but I
> think this is back to front. The patch above will reduce the
> advertised window by sk_backlog.len, but at the time that the window was
> advertised that allowed the dropped packets to be sent the backlog was
> empty. It is later, when the kernel is waking the application and takes
> the socket lock that the backlog starts to be used and the drop happens.
> But reducing the window advertised at this point is futile - the packets
> that will be dropped are already in flight.
>
Not really. If we receive these packets while the backlog is empty, then the
sender violates TCP rules.

We advertise the TCP window directly from the memory we are allowed to consume
(on the premise that the sender behaves correctly and does not send bytes in
small packets).
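
To make the window arithmetic concrete, here is a minimal userspace sketch of the calculation discussed above, with and without the backlog counted as used memory. It is a simplified model, not the kernel code: `struct sock_model`, `win_from_space()` and the two `space_*()` helpers are hypothetical names, and the scaling rule (reserve 1/2^scale of the space as overhead) is an assumption about how tcp_win_from_space() behaves for a positive scale.

```c
/* Hypothetical stand-in for the struct sock fields mentioned in the thread. */
struct sock_model {
    int rcvbuf;      /* models sk_rcvbuf: receive buffer limit, in bytes */
    int rmem_alloc;  /* models sk_rmem_alloc: bytes in receive/ofo queues */
    int backlog_len; /* models sk_backlog.len: bytes parked in the backlog */
};

/*
 * Assumed scaling: for a positive scale, keep back 1/2^scale of the space
 * as overhead; for a zero or negative scale, offer space / 2^(-scale).
 * Division is used instead of shifts so negative space is well defined.
 */
static int win_from_space(int space, int adv_win_scale)
{
    if (adv_win_scale <= 0)
        return space / (1 << -adv_win_scale);
    return space - space / (1 << adv_win_scale);
}

/* Behaviour before the patch: the backlog is invisible to the window. */
static int space_ignoring_backlog(const struct sock_model *sk, int scale)
{
    return win_from_space(sk->rcvbuf - sk->rmem_alloc, scale);
}

/* Behaviour with the patch: backlog bytes count as used memory. */
static int space_counting_backlog(const struct sock_model *sk, int scale)
{
    int used = sk->rmem_alloc + sk->backlog_len;

    /* May be negative; callers must be prepared for that. */
    return win_from_space(sk->rcvbuf - used, scale);
}
```

With a 100000 byte buffer, 40000 bytes in the receive queue, 20000 bytes in the backlog and a scale of 2, the first helper offers 45000 bytes while the second offers 30000.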
> The problem exists because the backlog has a tighter limit on it than
> the receive window does; I think the backlog should be able to accept
> sk_rcvbuf bytes in addition to what is already in the receive buffer (or
> up to the advertised receive window if that's smaller). At the moment
> it will only accept sk_rcvbuf bytes including what is already in the
> receive buffer. The logic being that in this case we're using the
> backlog because it's in the process of emptying the receive buffer into
> the application, and so the receive buffer will very soon be empty, and
> so we will very soon be able to accept sk_rcvbuf bytes. This is evident
> from the packet capture as the kernel stack is quite happy to accept the
> significant quantity of data that arrives as part of the same burst
> immediately after it has dropped a couple of packets.
>
This is not evident from the capture; you are mistaken.

tcpdump captures packets before the TCP stack, so it doesn't say whether they
are:

1) queued in the receive or ofo queue
2) queued in the socket backlog
3) dropped because we hit the socket rcvbuf limit

If the socket lock is held by the user, packets are queued to the backlog, or
dropped.

Then, when the socket lock is about to be released, we process the backlog.
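
The three outcomes above can be sketched as a small userspace model. This is an illustration under stated assumptions, not the kernel implementation: `struct rx_sock`, `rx_packet()` and `release_model()` are hypothetical names, the unlocked path is simplified to always queue, and the drop test (bytes already queued plus bytes in the backlog exceed the rcvbuf limit) is an assumed form of the limit described in this thread.

```c
#include <stdbool.h>

enum rx_outcome { RX_QUEUED_RECEIVE, RX_QUEUED_BACKLOG, RX_DROPPED };

/* Hypothetical stand-in for the socket state discussed in the thread. */
struct rx_sock {
    int rcvbuf;         /* models sk_rcvbuf */
    int rmem_alloc;     /* models sk_rmem_alloc (receive/ofo queues) */
    int backlog_len;    /* models sk_backlog.len */
    bool owned_by_user; /* true while the application holds the socket lock */
};

/* Decide what happens to one incoming packet of the given size. */
static enum rx_outcome rx_packet(struct rx_sock *sk, int truesize)
{
    if (!sk->owned_by_user) {
        /* Simplification: the unlocked path always queues the packet. */
        sk->rmem_alloc += truesize;
        return RX_QUEUED_RECEIVE;
    }

    /* Assumed limit: the backlog shares the rcvbuf budget with the queues. */
    if (sk->rmem_alloc + sk->backlog_len > sk->rcvbuf)
        return RX_DROPPED;

    sk->backlog_len += truesize;
    return RX_QUEUED_BACKLOG;
}

/* On lock release the backlog is processed into the receive queue. */
static void release_model(struct rx_sock *sk)
{
    sk->rmem_alloc += sk->backlog_len;
    sk->backlog_len = 0;
    sk->owned_by_user = false;
}
```

A packet capture sees every one of these packets arrive and cannot tell which branch each one took.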