From: Jeremy Harris <jgh@exim.org>
To: netdev@vger.kernel.org
Cc: linux-api@vger.kernel.org, edumazet@google.com,
ncardwell@google.com, Jeremy Harris <jgh@exim.org>
Subject: [PATCH net-next v3 0/6] tcp: support preloading data on a listening socket
Date: Mon, 9 Jun 2025 16:56:26 +0100
Message-ID: <cover.1749466540.git.jgh@exim.org>
I didn't get any comments on v2 apart from the kernel test robot,
so I'm repeating my responses to the v1 comments here.
I figured I should do a v3 to fix the compiler warnings the robot
pointed out.
v2 changes:
- Split out the preload operation to a separate routine from
tcp_sendmsg_locked() and restrict from looping over the supplied
iovec
v3 changes:
- Fix compiler warnings
------
Support writing to a listening TCP socket, for immediate
transmission on all later passive connection establishments
parented by the listen socket.
On a normal connection, transmission of the data is triggered by receipt
of the 3rd-ack. On a fastopen connection (with an accepted cookie), the
data is sent in the synack packet.
The data preload is done using a sendmsg with a newly-defined flag
(MSG_PRELOAD); the amount of data is limited to a single linear sk_buff.
Note that this flag uses the last-but-two bit available if "int"
is 32 bits.
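
For illustration, the application side might look roughly like the sketch
below. This is not code from the series: preload_greeting() is a
hypothetical helper, and the flag is passed in as a parameter because
unpatched headers do not define MSG_PRELOAD and its numeric value is not
stated here. Treating a short write as an error is also an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: queue `len` bytes of server-first greeting on a
 * listening TCP socket.  `preload_flag` is expected to be MSG_PRELOAD as
 * defined by the patched include/linux/socket.h; it is a parameter here
 * because unpatched headers do not provide it.
 * Returns 0 on success, -1 with errno set on failure.
 */
int preload_greeting(int listen_fd, int preload_flag,
                     const void *buf, size_t len)
{
    struct iovec iov = {
        .iov_base = (void *)buf,  /* sendmsg() does not modify the data */
        .iov_len = len,
    };
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,          /* single iovec; the preload path does not loop */
    };
    ssize_t n = sendmsg(listen_fd, &msg, preload_flag);

    if (n < 0)
        return -1;                /* errno set by sendmsg() */
    if ((size_t)n != len) {       /* assumption: a short preload is an error */
        errno = EMSGSIZE;
        return -1;
    }
    return 0;
}
```

A server would call this once after listen(), e.g.
preload_greeting(fd, MSG_PRELOAD, "220 ready\r\n", 11), and then accept()
as usual. On a kernel without this series the sendmsg on a listening
socket is expected to fail.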
Intent: lower latency for server-first protocols using TCP.
Known cases of this use are SMTP and MySQL.
Measurements:
Packet capture (laptop, loopback, TFO requested) for initial SYN to first
client data packet (5 samples):
- baseline TFO-C     1064 1470 1455 1547 1595 usec
- patched non-TFO     140  150  159  144  153 usec
- patched TFO-C       142  149  149  125  125 usec
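
To put a single number on each row, the samples above can be averaged.
The sketch below is only arithmetic on the figures quoted in this mail;
the array and function names are made up for the example.

```c
#include <stddef.h>

/* Samples copied from the measurements above, in microseconds. */
const unsigned baseline_tfo_c[]  = { 1064, 1470, 1455, 1547, 1595 };
const unsigned patched_non_tfo[] = {  140,  150,  159,  144,  153 };
const unsigned patched_tfo_c[]   = {  142,  149,  149,  125,  125 };

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
double mean_usec(const unsigned *v, size_t n)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return n ? (double)sum / (double)n : 0.0;
}
```

This gives means of about 1426 usec (baseline TFO-C), 149 usec (patched
non-TFO) and 138 usec (patched TFO-C), i.e. roughly a tenfold reduction
on this small sample.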
Out of scope:
- Client-first protocols
- TLS-on-connect
Testing:
A) packetdrill scripts for
- normal non-TFO
- normal TFO
- synack lost
- 3rd-ack acks only the SYN
- 3rd-ack acks partial data
(NB: packetdrill can only check the data size, not actual content)
B) Application use, running the application testsuite
and manual check of specific cases via packet capture
C) Daily-driver laptop use (not expected to trigger the feature;
only regression-test)
D) KASAN/syzkaller
- enable_syscalls: "socket$inet_tcp", "listen", "sendmsg", "accept",
"read", "write", "close", "syz_emit_ethernet", "syz_extract_tcp_res"
- the coverage seems rather limited; the sendmsg onto a listen socket
is there, but I am not convinced actual TCP connections are being
exercised. tcp_input.c is only 2% covered; tcp_minisocks.c is entirely
uncovered.
- A need for limiting iteration in the sendmsg handling was found (RCU
timeouts), hence v2, but no hint of locking problems.
Eric: could you expand on your previous comment "I do not see any
locking"? If it referred to the syscall write operation on the listening
socket, tcp_sendmsg_locked() is called with the sk locked - so I'm
unsure where you're looking.
Jeremy Harris (6):
  tcp: support writing to a socket in listening state
  tcp: copy write-data from listen socket to accept child socket
  tcp: fastopen: add write-data to fastopen synack packet
  tcp: transmit any pending data on receipt of 3rd-ack
  tcp: fastopen: retransmit data when only the SYN of a synack-with-data
    is acked
  tcp: fastopen: extend retransmit-queue trimming to handle linear
    sk_buff

 include/linux/socket.h                       |   1 +
 net/ipv4/tcp.c                               | 112 ++++++++++++++++++
 net/ipv4/tcp_fastopen.c                      |   3 +-
 net/ipv4/tcp_input.c                         |  15 ++-
 net/ipv4/tcp_ipv4.c                          |   4 +-
 net/ipv4/tcp_minisocks.c                     |  58 ++++++++-
 net/ipv4/tcp_output.c                        |  50 +++++++-
 .../perf/trace/beauty/include/linux/socket.h |   1 +
 tools/perf/trace/beauty/msg_flags.c          |   3 +
 9 files changed, 234 insertions(+), 13 deletions(-)

base-commit: 2c7e4a2663a1ab5a740c59c31991579b6b865a26
--
2.49.0