From: Mariusz Klimek <maklimek97@gmail.com>
To: netdev@vger.kernel.org
Cc: andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, dsahern@kernel.org,
	idosch@nvidia.com, ncardwell@google.com, shuah@kernel.org,
	kuniyu@google.com, alice@isovalent.com,
	Mariusz Klimek <maklimek97@gmail.com>
Subject: [PATCH net-next 00/10] tcp: support non-GSO jumbograms
Date: Mon,  8 Jun 2026 18:33:17 +0200
Message-ID: <20260608130755.5626-1-maklimek97@gmail.com>

This series adds support for sending TCP jumbograms over MTUs above 65535,
and adds support for setting such MTUs on veth devices.

The TCP stack is already capable of receiving jumbograms and (up until
recently) sending GSO-jumbograms that get segmented before exiting the
host. However, the TCP stack doesn't support sending regular jumbograms
over high MTUs as described in RFC 2675. Specifically, the TCP stack doesn't
support an MSS greater than 65535.
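As a rough illustration of the two RFC 2675 pieces involved, the sketch
below encodes the Jumbo Payload hop-by-hop option and applies the RFC's
rule for the 16-bit TCP MSS option (65535 is sent when the real value does
not fit, and a received 65535 means "use PMTU minus 60"). The function
names are hypothetical and this is not the kernel's code, only a
standalone sketch of the wire format and the MSS rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_MAX_PLEN   65535u /* largest value of the 16-bit Payload Length */
#define HBH_JUMBO_LEN   8u     /* hop-by-hop header holding only the option */
#define OPT_TYPE_JUMBO  0xC2u  /* Jumbo Payload option type (RFC 2675) */
#define IP6_TCP_HDRS    60u    /* 40-byte IPv6 header + 20-byte TCP header */

/* Nonzero when the length cannot be carried in the 16-bit field. */
static int needs_jumbo_option(uint32_t payload_len)
{
	return payload_len > IPV6_MAX_PLEN;
}

/*
 * Fill an 8-byte hop-by-hop header carrying the Jumbo Payload option.
 * upper_len is the number of bytes that follow this header (for example
 * TCP header plus data). The encoded length also counts the hop-by-hop
 * header itself, as RFC 2675 requires.
 */
static size_t build_jumbo_hbh(uint8_t out[HBH_JUMBO_LEN],
			      uint8_t next_header, uint32_t upper_len)
{
	uint32_t jumbo_len;

	assert(upper_len <= UINT32_MAX - HBH_JUMBO_LEN);
	jumbo_len = upper_len + HBH_JUMBO_LEN;
	assert(needs_jumbo_option(jumbo_len));

	out[0] = next_header;		/* Next Header */
	out[1] = 0;			/* Hdr Ext Len: 0 means 8 bytes */
	out[2] = OPT_TYPE_JUMBO;	/* Option Type */
	out[3] = 4;			/* Opt Data Len */
	out[4] = (uint8_t)(jumbo_len >> 24); /* length, network byte order */
	out[5] = (uint8_t)(jumbo_len >> 16);
	out[6] = (uint8_t)(jumbo_len >> 8);
	out[7] = (uint8_t)jumbo_len;
	return HBH_JUMBO_LEN;
}

/* Value to place in the 16-bit MSS option for a given interface MTU. */
static uint16_t mss_option_to_send(uint32_t if_mtu)
{
	uint32_t mss;

	assert(if_mtu > IP6_TCP_HDRS);
	mss = if_mtu - IP6_TCP_HDRS;
	return mss >= 65535u ? 65535u : (uint16_t)mss;
}

/* Effective send MSS given the peer's MSS option and the path MTU. */
static uint32_t effective_mss(uint16_t peer_mss_opt, uint32_t pmtu)
{
	uint32_t path_mss;

	assert(pmtu > IP6_TCP_HDRS);
	path_mss = pmtu - IP6_TCP_HDRS;
	if (peer_mss_opt == 65535u)	/* 65535 is treated as infinity */
		return path_mss;
	return peer_mss_opt < path_mss ? peer_mss_opt : path_mss;
}
```

When the option is present, the Payload Length field of the IPv6 header
itself is set to zero and receivers read the length from the option.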

No device currently supports setting an MTU greater than 65535, but some
devices could benefit from higher MTUs and can theoretically support them.
For example, IPoIB in connected mode can support MTUs up to 2^31. Virtual
devices such as veth have no physical limitations and therefore could also
support such MTUs.

This series only adds support for setting high MTUs on veth devices, with
support for other devices being deferred to future patch series. Allowing
jumbograms to pass through veth devices is useful for testing jumbogram
functionality, and also increases throughput when compared to BIG TCP (see
the benchmark numbers below).

In addition to removing the upper limit on veth MTUs, this series adds the
missing pieces that allow the TCP stack to send TCP jumbograms. This
includes a bit of code removed recently by Alice's patch series [1]. Most
of the patches in this series are rather trivial. The main problem
addressed by this series is that the TCP stack conflates the TSO segment
length with the MSS, but the TSO segment length is a 16-bit integer, which
doesn't allow for MSS > 65535. This series decouples them.
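To make the conflation concrete, here is a small standalone sketch. It
shows what happens when an MSS above 65535 is squeezed into a 16-bit
field, and one possible way of keeping the two values apart. The struct
and function names are hypothetical, the MSS of 500000 is an arbitrary
example value, and the actual patches may well structure this
differently.

```c
#include <assert.h>
#include <stdint.h>

/* Storing a wide MSS in a 16-bit field silently keeps only the low bits. */
static uint16_t store_in_u16(uint32_t mss)
{
	return (uint16_t)mss;
}

/* Hypothetical bookkeeping that keeps the MSS and the segment size apart. */
struct seg_plan {
	uint32_t mss;		/* may exceed 65535 */
	uint16_t gso_size;	/* per-segment size, 0 if no split is needed */
	uint32_t nsegs;		/* number of segments the payload becomes */
};

static struct seg_plan plan_segments(uint32_t payload_len, uint32_t mss)
{
	struct seg_plan p = { .mss = mss, .gso_size = 0, .nsegs = 1 };

	assert(mss > 0);
	if (payload_len > mss) {
		/*
		 * Assumption of this sketch: only payloads that must be
		 * split carry a segment size, and that size fits 16 bits.
		 */
		assert(mss <= UINT16_MAX);
		p.gso_size = (uint16_t)mss;
		p.nsegs = (payload_len + mss - 1) / mss;
	}
	return p;
}
```

With an MSS of 500000, store_in_u16() yields 41248, which is why the wide
MSS cannot simply share storage with the segment length. In the sketch, a
400000-byte payload with that MSS stays a single segment with a segment
size of 0, while a 524280-byte payload with an MSS of 1440 is planned as
365 segments.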

Running iperf3 over veth on an Intel Core i9-12900K CPU shows a 5-6%
improvement in throughput over BIG TCP (220 vs. 208 Gbits/sec on average).

Commands:
iperf3 -s
iperf3 -c ${SERVER_IP}%veth0 -l 500KiB

MTU=524280, gso disabled:
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  24.8 GBytes   213 Gbits/sec
[  5]   1.00-2.00   sec  25.5 GBytes   219 Gbits/sec
[  5]   2.00-3.00   sec  26.0 GBytes   223 Gbits/sec
[  5]   3.00-4.00   sec  25.8 GBytes   222 Gbits/sec
[  5]   4.00-5.00   sec  25.8 GBytes   221 Gbits/sec
[  5]   5.00-6.00   sec  25.4 GBytes   218 Gbits/sec
[  5]   6.00-7.00   sec  25.3 GBytes   217 Gbits/sec
[  5]   7.00-8.00   sec  26.1 GBytes   224 Gbits/sec
[  5]   8.00-9.00   sec  25.9 GBytes   223 Gbits/sec
[  5]   9.00-10.00  sec  25.6 GBytes   220 Gbits/sec
[  5]  10.00-10.00  sec  28.3 MBytes   154 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec   256 GBytes   220 Gbits/sec

MTU=1500, gso_max_size=524280:
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  24.1 GBytes   207 Gbits/sec
[  5]   1.00-2.00   sec  24.1 GBytes   207 Gbits/sec
[  5]   2.00-3.00   sec  24.6 GBytes   211 Gbits/sec
[  5]   3.00-4.00   sec  24.4 GBytes   209 Gbits/sec
[  5]   4.00-5.00   sec  24.5 GBytes   210 Gbits/sec
[  5]   5.00-6.00   sec  24.3 GBytes   209 Gbits/sec
[  5]   6.00-7.00   sec  24.6 GBytes   211 Gbits/sec
[  5]   7.00-8.00   sec  24.0 GBytes   206 Gbits/sec
[  5]   8.00-9.00   sec  23.9 GBytes   205 Gbits/sec
[  5]   9.00-10.00  sec  24.1 GBytes   207 Gbits/sec
[  5]  10.00-10.00  sec  7.81 MBytes   120 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec   243 GBytes   208 Gbits/sec
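
For reference, the relative gain between the two 10-second averages above
can be checked with a one-line calculation. The helper name is
hypothetical and the sketch only restates the arithmetic.

```c
#include <assert.h>

/* Relative improvement of new_rate over base_rate, in percent. */
static double relative_gain_pct(double new_rate, double base_rate)
{
	assert(base_rate > 0.0);
	return (new_rate - base_rate) / base_rate * 100.0;
}
```

Calling relative_gain_pct(220.0, 208.0) gives a value just under 5.8.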

[1] 20260205133925.526371-1-alice.kernel@fastmail.im (full link was blocked by gmail)

Mariusz Klimek (10):
  ipv6: do not fragment packets into jumbograms
  ipv6: allow route exceptions with MTUs above 65535
  ipv6: add jumbo payload option to non-gso jumbograms
  tcp: decouple TSO segment length from MSS
  tcp: split jumbograms with urgent pointer correctly
  tcp: set MSS correctly for PMTU above 65535
  veth: raise the max MTU above 65535
  selftests/net: test sending TCP jumbograms over veth
  selftests/net: add test cases with MTU above 65535 to big_tcp.sh
  selftests/net: add jumbogram test case to msg_zerocopy.sh

 drivers/net/veth.c                          |   8 +-
 include/linux/ipv6.h                        |   6 +
 include/net/ip6_route.h                     |   5 +-
 include/net/ipv6.h                          |  11 +
 include/net/tcp.h                           |  12 +-
 net/ipv4/tcp.c                              |  12 +-
 net/ipv4/tcp_output.c                       |  84 +++--
 net/ipv4/tcp_timer.c                        |   4 +-
 net/ipv6/ip6_output.c                       |  24 +-
 net/ipv6/route.c                            |   2 +-
 tools/testing/selftests/net/Makefile        |   3 +
 tools/testing/selftests/net/big_tcp.sh      |  24 +-
 tools/testing/selftests/net/jumbogram.bpf.c |  36 ++
 tools/testing/selftests/net/jumbogram.sh    | 380 ++++++++++++++++++++
 tools/testing/selftests/net/jumbogram_rx.c  | 199 ++++++++++
 tools/testing/selftests/net/jumbogram_tx.c  | 139 +++++++
 tools/testing/selftests/net/msg_zerocopy.c  |   3 +-
 tools/testing/selftests/net/msg_zerocopy.sh |   3 +-
 18 files changed, 897 insertions(+), 58 deletions(-)
 create mode 100644 tools/testing/selftests/net/jumbogram.bpf.c
 create mode 100755 tools/testing/selftests/net/jumbogram.sh
 create mode 100644 tools/testing/selftests/net/jumbogram_rx.c
 create mode 100644 tools/testing/selftests/net/jumbogram_tx.c

-- 
2.47.3
