netdev.vger.kernel.org archive mirror
From: Willy Tarreau <w@1wt.eu>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Julian Anastasov <ja@ssi.bg>, David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org
Subject: Re: TCP_DEFER_ACCEPT is missing counter update
Date: Fri, 16 Oct 2009 09:19:02 +0200
Message-ID: <20091016071902.GA11244@1wt.eu>
In-Reply-To: <4AD81C0B.90804@gmail.com>

On Fri, Oct 16, 2009 at 09:08:59AM +0200, Eric Dumazet wrote:
(...)
> > Yes it could differ if a pure ACK is lost between the client and the server,
> > but in my opinion what is important is not to precisely account the number
> > of ACKs to ensure we wake up exactly after XXX ACKs received, but that in
> > most common situations we avoid waking up too early.
> > 
> 
> We are basically saying the same thing, but you misunderstood me. I was concerned
> about a lost (server -> client) SYN-ACK, not a lost (client -> server) ACK, which is
> fine (even without playing with TCP_DEFER_ACCEPT at all).
> 
> In this case, if we do the retrans test, we'll accept the first (client -> server) ACK
> and wake up the application, while most probably we'll receive the client request
> a few milliseconds later.

OK, I get your point. We can detect that, though, as Julian explained, with
the ->acked field. It indicates we got an ACK, which proves the SYN-ACK was
received. At first glance, I think that Julian's algorithm explained at the
end of his mail covers all cases without using any additional field,
though an additional field would not be an issue anyway.
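
As an illustration of the idea above, here is a minimal sketch of how a
bare ACK could be handled while the deferral is active. The struct, its
fields and the function name are hypothetical simplifications, not the
kernel's actual data structures, and this is not necessarily Julian's
algorithm: it only shows the "keep deferring until the SYN-ACK retransmit
counter reaches the limit, and remember that an ACK was seen" logic.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a pending connection request. */
struct pending_req {
    unsigned int retrans; /* SYN-ACK retransmits sent so far */
    bool acked;           /* set once an ACK proves the SYN-ACK arrived */
};

/*
 * Returns true when the incoming ACK should be dropped so that the
 * application is not woken up yet, false when the connection should
 * be handed to the application.
 */
static bool defer_bare_ack(struct pending_req *req,
                           unsigned int defer_limit,
                           bool ack_has_data)
{
    if (ack_has_data)
        return false;   /* the request arrived: wake the application */
    if (req->retrans >= defer_limit)
        return false;   /* deferral period is over: stop waiting */
    req->acked = true;  /* remember that the SYN-ACK was received */
    return true;        /* bare ACK inside the deferral period */
}
```

With the ->acked flag recorded this way, a later timer pass can tell a
client that never received the SYN-ACK from one that did but has not sent
its request yet.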

> > Also, keep in mind that the TCP_DEFER_ACCEPT parameter is passed in number
> > of seconds by the application, which are in turn converted to a number of
> > retransmits based on our own timer, which means that our SYN-ACK counter
> > is what most closely matches the application's expected delay, even if an
> > ACK from the client gets lost in between or if a client's stack retransmits
> > pure ACKs very fast for any implementation-specific reason.
> > 
> 
> Well, this is why converting the application delay (the sockopt() argument) from
> seconds to a SYN-ACK retransmit count is suboptimal and error-prone.

I agree, but it allows the application to be unaware of retransmit timers.
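
To make the conversion being discussed concrete, here is a rough sketch of
how a delay in seconds could be mapped to a number of SYN-ACK retransmit
periods. The function name, the 3-second initial timeout and the 120-second
cap used in the example are assumptions for illustration, not values taken
from this thread, and the real kernel code may differ.

```c
/*
 * Hypothetical helper: return the smallest number of SYN-ACK retransmit
 * periods whose cumulative duration covers `seconds`, assuming the
 * period starts at `init_timeout` and doubles each time, capped at
 * `max_timeout`. Returns 0 when no deferral is requested.
 */
static int secs_to_retrans_sketch(int seconds, int init_timeout,
                                  int max_timeout)
{
    int count = 0;

    if (seconds > 0) {
        int period = init_timeout;
        int elapsed = period;

        count = 1;
        while (elapsed < seconds && count < 255) {
            count++;
            period *= 2;
            if (period > max_timeout)
                period = max_timeout;
            elapsed += period;
        }
    }
    return count;
}
```

Under these assumptions, requests of 1, 2 or 3 seconds all map to a single
period, and 4 seconds already maps to two periods (3 + 6 = 9 seconds),
which illustrates how coarse the mapping is.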

> This might be changed to be mapped to what the documentation states: a number of seconds,
> or even better a number of milliseconds (new TCP_DEFER_ACCEPT_MS setsockopt cmd),
> because a high performance server won't play with > 1 sec values anyway.

It would be nice, but it would require a new timer. The current implementation
does not need any and is efficient enough for most common cases. In fact, it
would have been better to simply be able to specify that we want to skip one
empty ACK (or X empty ACKs). But let's make use of what we currently have;
with your (or Julian's) changes, it should cover almost all usages without
changing semantics for applications.

Regards,
Willy


