From: Willy Tarreau <w@1wt.eu>
To: Ronny Meeus <ronny.meeus@gmail.com>
Cc: David Laight <David.Laight@aculab.com>,
	Eric Dumazet <erdnetdev@gmail.com>,
	netdev <netdev@vger.kernel.org>
Subject: Re: TCP socket send return EAGAIN unexpectedly when sending small fragments
Date: Fri, 10 Jun 2022 19:42:01 +0200
Message-ID: <20220610174201.GC19540@1wt.eu>
In-Reply-To: <CAMJ=MEe3r+ZrAONTciQgU4yqtXTJJvXc0OFvJYwYg20kPGQtdA@mail.gmail.com>

On Fri, Jun 10, 2022 at 07:16:06PM +0200, Ronny Meeus wrote:
> On Fri, 10 Jun 2022 at 17:21, David Laight <David.Laight@aculab.com> wrote:
> >
> > ...
> > > If the 5 queued packets on the sending side would cause the EAGAIN
> > > issue, the real question maybe is why the receiving side is not
> > > sending the ACK within the 10ms while for earlier messages the ACK is
> > > sent much sooner.
> >
> > Have you disabled Nagle (TCP_NODELAY) ?
> 
> Yes I enabled TCP_NODELAY so the Nagle algo is disabled.
> I did a lot of tests over the last couple of days but if I remember well
> enable or disable TCP_NODELAY does not influence the result.

There are many possible causes for what you're observing. For example,
if your NIC has too small a Tx ring and small buffers, you can imagine
that N*106 bytes fit in the buffers but N*107 do not, which causes
a tiny delay waiting for the Tx IRQ to recycle the buffers, and that
during this time your subsequent send() calls are coalesced into larger
segments that are sent at once when using 107.

If you do not want packets to be sent individually and you know you
still have more data to come, you need to pass MSG_MORE in the send()
flags (or disable TCP_NODELAY).
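
As an illustration of the MSG_MORE suggestion above, here is a minimal
sketch for Linux. The struct fragment type and the send_fragments() helper
are hypothetical names made up for this example, not taken from the code
discussed in this thread. Every fragment except the last is sent with
MSG_MORE, so the stack is told that more data follows and may hold back a
partial segment:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical description of one small application-level fragment. */
struct fragment {
    const void *data;
    size_t len;
};

/*
 * Send `count` fragments on a stream socket. All fragments but the last
 * carry MSG_MORE (Linux-specific), hinting that more data is coming.
 * Returns the total number of bytes sent, or -1 on the first send() error
 * (errno is left as set by send(), e.g. EAGAIN on a non-blocking socket;
 * bytes already sent are not reported in that case).
 */
static ssize_t send_fragments(int fd, const struct fragment *frags, size_t count)
{
    ssize_t total = 0;

    assert(count > 0);
    for (size_t i = 0; i < count; i++) {
        int flags = MSG_NOSIGNAL;
        size_t off = 0;

        if (i + 1 < count)
            flags |= MSG_MORE;      /* not the last fragment */

        /* send() may accept only part of the buffer, so loop. */
        while (off < frags[i].len) {
            ssize_t n = send(fd, (const char *)frags[i].data + off,
                             frags[i].len - off, flags);
            if (n < 0)
                return -1;
            off += (size_t)n;
        }
        total += (ssize_t)off;
    }
    return total;
}
```

The last fragment is sent without MSG_MORE, which lets the stack push out
whatever it has accumulated.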

Clearly, when running with TCP_NODELAY you're asking the whole stack
"do your best to send as fast as possible", which implies "without any
consideration for efficiency". I've seen a situation in the past where
it was impossible to send any extra segment while a first unacked PUSH
was in flight. Simply sending full segments was enough to considerably
increase the performance. I analysed this as a result of the SWS (silly
window syndrome) avoidance algorithm and concluded that it was normal in
that situation, though I have not witnessed it in a while.
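
One way to give the stack a chance to build full segments is to hand it
several fragments in a single system call rather than one send() per
fragment. Below is a rough sketch using sendmsg() with an iovec array. The
send_batched() helper is a hypothetical name for this example, and whether
this helps in the case discussed here is an assumption that would need to
be measured:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/*
 * Send all buffers described by iov[0..iovcnt-1] on a stream socket,
 * using as few sendmsg() calls as possible. The iov array is modified
 * in place to track progress after a partial send. The caller is assumed
 * to keep iovcnt within the system's IOV_MAX limit.
 * Returns the total number of bytes sent, or -1 on error (errno is left
 * as set by sendmsg()).
 */
static ssize_t send_batched(int fd, struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    assert(iovcnt > 0);
    while (iovcnt > 0) {
        struct msghdr msg = { 0 };
        ssize_t n;

        msg.msg_iov = iov;
        msg.msg_iovlen = iovcnt;

        n = sendmsg(fd, &msg, MSG_NOSIGNAL);
        if (n < 0)
            return -1;
        total += n;

        /* Skip the entries that were written completely. */
        while (iovcnt > 0 && (size_t)n >= iov->iov_len) {
            n -= (ssize_t)iov->iov_len;
            iov++;
            iovcnt--;
        }
        /* Adjust a partially written entry, if any. */
        if (iovcnt > 0) {
            iov->iov_base = (char *)iov->iov_base + n;
            iov->iov_len -= (size_t)n;
        }
    }
    return total;
}
```

Copying the fragments into one user-space buffer before a single send()
would have a similar effect, at the cost of an extra copy.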

So just keep in mind not to overuse TCP_NODELAY.

Willy
