linux-api.vger.kernel.org archive mirror
From: Jeremy Harris <jgh@exim.org>
To: Neal Cardwell <ncardwell@google.com>
Cc: netdev@vger.kernel.org, linux-api@vger.kernel.org, edumazet@google.com
Subject: Re: [PATCH 0/6] tcp: support preloading data on a listening socket
Date: Fri, 16 May 2025 21:10:39 +0100
Message-ID: <ff23f425-536c-43b7-b536-58d99e61f2f8@exim.org>
In-Reply-To: <CADVnQymxsOGLnUfurhDLXNUaK4gpaYm2zTDEWRxy8JPqH6O6vg@mail.gmail.com>

Hi Neal,

Thanks for the initial review.


On 2025/05/16 7:19 PM, Neal Cardwell wrote:
> On Fri, May 16, 2025 at 11:55 AM Jeremy Harris <jgh@exim.org> wrote:
>>
>> Support write to a listen TCP socket, for immediate
>> transmission on passive connection establishments.
>>
>> On a normal connection transmission is triggered by the receipt of
>> the 3rd-ack. On a fastopen (with accepted cookie) connection the data
>> is sent in the synack packet.
>>
>> The data preload is done using a sendmsg with a newly-defined flag
>> (MSG_PRELOAD); the amount of data limited to a single linear sk_buff.
>> Note that this definition is the last-but-two bit available if "int"
>> is 32 bits.
> 
> Can you please add a bit more context, like:
> 
> + What is the motivating use case? (Accelerating Exim?)

Accelerating any server-first upper-layer protocol (ULP); SMTP is the
major one I know of.  And yes, Exim is my primary test case; it is
running against a test kernel with this patch series.

One caveat: the initial server data cannot change from one passive
connection to another.

> Is this
> targeted for connections using encryption (like TLS/SSL), or just
> plain-text connections?

TLS-on-connect cannot benefit, being client-first.  SMTP that uses
STARTTLS can take advantage of it, as can plaintext SMTP.

I would not expect HTTPS to be able to use it, as it is also client-first.


> + What are the exact performance improvements you are seeing in your
> benchmarks that (a) motivate this, and (b) justify any performance
> impact on the TCP stack?

Because no userland round trip is needed to produce the initial server
data, there is a latency benefit.  The benefit is larger for the TFO-C
case, but it is also significant for the non-TFO case.

Packet capture (laptop, loopback, TFO-C case), time from initial SYN to
first client data packet, 5 samples per row:

  kernel     mode       samples (usec)
  baseline   TFO-C      1064 1470 1455 1547 1595
  patched    non-TFO     140  150  159  144  153
  patched    TFO-C       142  149  149  125  125
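
To make the figures above easier to compare, here is a small sketch that
computes the mean and median of each row and the ratio of the baseline
mean to each row's mean.  The numbers are copied from the table; the
labels and the summarize() helper are only illustrative names.  With
five loopback samples per row the result is indicative at best.

```python
from statistics import mean, median

# Samples in microseconds, copied from the packet-capture table.
samples_usec = {
    "baseline TFO-C": [1064, 1470, 1455, 1547, 1595],
    "patched non-TFO": [140, 150, 159, 144, 153],
    "patched TFO-C": [142, 149, 149, 125, 125],
}


def summarize(samples):
    """Return {label: (mean, median)} in microseconds for each row."""
    return {label: (mean(v), median(v)) for label, v in samples.items()}


summary = summarize(samples_usec)
baseline_mean = summary["baseline TFO-C"][0]

for label, (avg, med) in summary.items():
    # Ratio > 1 means the row is faster than the baseline on average.
    ratio = baseline_mean / avg
    print(f"{label:16s} mean {avg:7.1f}  median {med:5d}  "
          f"baseline/mean {ratio:5.2f}x")
```

The means work out to 1426.2, 149.2 and 138 usec, so on this setup the
patched kernel reaches the first client data packet roughly ten times
sooner than the baseline.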



In most packet captures the server sends one fewer packet; sometimes
there is one fewer in each direction.  The server application also makes
one fewer kernel entry/exit.

I'm hoping those differences will add up to both less CPU time (on both
endpoints) and less wire time.  However, I have not run benchmarks looking
for a change in the peak connection-handling rate.



In summary, this is the mirror of TCP Fast Open client data: the latency
benefit is probably the most useful aspect.


> + Regarding "Support write to a listen TCP socket, for immediate
> transmission on passive connection establishments.": can you please
> make it explicitly clear whether the data written to the listening
> socket is saved and transmitted on all future successful passive
> sockets that are created for the listener,

This.  The data is copied for each future passive socket from this
listener,

> or is just transmitted on
> the next connection that is created?

(and not this option).
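
A minimal userland sketch of those semantics, assuming a kernel built
with this patch series.  The numeric value of MSG_PRELOAD is not given
in this thread, so it is passed in as a parameter rather than guessed;
preload_listener() and the greeting text are hypothetical names used
only for illustration, not part of the patches.

```python
import socket


def preload_listener(listener: socket.socket, data: bytes,
                     msg_preload: int) -> bool:
    """Ask the kernel to keep `data` for every future accepted connection.

    msg_preload is the MSG_PRELOAD value taken from the patched kernel's
    headers (assumption: the caller supplies it; no default is defined
    here because the thread does not state the value).

    Returns True if the kernel accepted all of `data`, False otherwise.
    """
    try:
        # One sendmsg() on the *listening* socket, done once at startup.
        sent = listener.sendmsg([data], [], msg_preload)
    except OSError:
        # A kernel without the patches is expected to refuse a write to
        # a listening socket; fall back to sending after accept().
        return False
    return sent == len(data)
```

A server would call this once after listen(), for example with an SMTP
greeting such as b"220 mail.example.org ESMTP\r\n", and would skip its
own greeting write on accepted sockets when the call returns True.
Because the same bytes go to every passive connection, anything that
varies per connection cannot be part of the preloaded data.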




I'll copy these comments into any future v2.
As Eric says, I should run KASAN/syzbot first.

-- 
Cheers,
   Jeremy

Thread overview: 12+ messages
2025-05-16 15:54 [PATCH 0/6] tcp: support preloading data on a listening socket Jeremy Harris
2025-05-16 15:54 ` [PATCH 1/6] tcp: support writing to a socket in listening state Jeremy Harris
2025-05-16 15:55 ` [PATCH 2/6] tcp: copy write-data from listen socket to accept child socket Jeremy Harris
2025-05-16 17:51   ` Eric Dumazet
2025-05-16 20:11     ` Jeremy Harris
2025-05-16 15:55 ` [PATCH 3/6] tcp: fastopen: add write-data to fastopen synack packet Jeremy Harris
2025-05-16 15:55 ` [PATCH 4/6] tcp: transmit any pending data on receipt of 3rd-ack Jeremy Harris
2025-05-16 15:55 ` [PATCH 5/6] tcp: fastopen: retransmit data when only the SYN of a synack-with-data is acked Jeremy Harris
2025-05-16 15:55 ` [PATCH 6/6] tcp: fastopen: extend retransmit-queue trimming to handle linear sk_buff Jeremy Harris
2025-05-16 16:58 ` [PATCH 0/6] tcp: support preloading data on a listening socket Jeremy Harris
2025-05-16 18:19 ` Neal Cardwell
2025-05-16 20:10   ` Jeremy Harris [this message]
