From: Sabrina Dubroca <sd@queasysnail.net>
To: Wilfred Mallawa <wilfred.opensource@gmail.com>
Cc: "David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
John Fastabend <john.fastabend@gmail.com>,
Simon Horman <horms@kernel.org>,
netdev@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org,
Alistair Francis <alistair.francis@wdc.com>,
Damien Le'Moal <dlemoal@kernel.org>,
Wilfred Mallawa <wilfred.mallawa@wdc.com>
Subject: Re: [PATCH v2] net/tls: support maximum record size limit
Date: Tue, 2 Sep 2025 18:07:20 +0200
Message-ID: <aLcWOJeAFeM6_U6w@krikkit>
In-Reply-To: <20250902033809.177182-2-wilfred.opensource@gmail.com>
2025-09-02, 13:38:10 +1000, Wilfred Mallawa wrote:
> From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
>
> During a handshake, an endpoint may specify a maximum record size limit.
> Currently, the kernel defaults to TLS_MAX_PAYLOAD_SIZE (16KB) for the
> maximum record size. Meaning that, the outgoing records from the kernel
> can exceed a lower size negotiated during the handshake. In such a case,
> the TLS endpoint must send a fatal "record_overflow" alert [1], and
> thus the record is discarded.
>
> Upcoming Western Digital NVMe-TCP hardware controllers implement TLS
> support. For these devices, supporting TLS record size negotiation is
> necessary because the maximum TLS record size supported by the controller
> is less than the default 16KB currently used by the kernel.
>
> This patch adds support for retrieving the negotiated record size limit
> during a handshake, and enforcing it at the TLS layer such that outgoing
> records are no larger than the size negotiated. This patch depends on
> the respective userspace support in tlshd and GnuTLS [2].
>
> [1] https://www.rfc-editor.org/rfc/rfc8449
> [2] https://gitlab.com/gnutls/gnutls/-/merge_requests/2005
>
> Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> ---
> Documentation/networking/tls.rst | 7 ++++++
> include/net/tls.h | 1 +
> include/uapi/linux/tls.h | 2 ++
> net/tls/tls_main.c | 39 ++++++++++++++++++++++++++++++--
> net/tls/tls_sw.c | 4 ++++
> 5 files changed, 51 insertions(+), 2 deletions(-)
A selftest would be nice (tools/testing/selftests/net/tls.c), but I'm
not sure what we could do on the "RX" side to check that we are
respecting the size restriction. Use a basic TCP socket and try to
parse (and then discard without decrypting) records manually out of
the stream and see if we got the length we wanted?
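To make the idea concrete, here is a rough userspace sketch of the parsing side. It only assumes the standard 5-byte TLS record header (type, legacy version, 16-bit big-endian length); the helper name tls_max_record_len is made up for illustration and is not an existing selftest function. Note that the length on the wire is the ciphertext length, so a real test would have to add the cipher- and version-dependent overhead to the configured limit before comparing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* content type (1) + legacy version (2) + length (2) */
#define TLS_HEADER_SIZE 5

/*
 * Hypothetical helper: walk a buffer of raw TLS records, as read from a
 * plain TCP socket, without decrypting anything. Returns the largest
 * record length field seen, 0 for an empty buffer, or -1 if the buffer
 * ends in the middle of a header or a record body.
 */
static long tls_max_record_len(const uint8_t *buf, size_t len)
{
	size_t off = 0;
	long max_len = 0;

	assert(buf != NULL || len == 0);

	while (off < len) {
		size_t rec_len;

		/* Not enough bytes left for a full header. */
		if (len - off < TLS_HEADER_SIZE)
			return -1;

		/* The length field is big-endian, at header bytes 3 and 4. */
		rec_len = ((size_t)buf[off + 3] << 8) | buf[off + 4];

		/* Record body is truncated. */
		if (len - off - TLS_HEADER_SIZE < rec_len)
			return -1;

		if ((long)rec_len > max_len)
			max_len = (long)rec_len;

		/* Skip the body without looking at it. */
		off += TLS_HEADER_SIZE + rec_len;
	}

	return max_len;
}
```

The test would then send a large buffer over the kTLS socket, read everything back from the plain TCP peer, and check that the returned value does not exceed the limit plus overhead.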
> diff --git a/include/net/tls.h b/include/net/tls.h
> index 857340338b69..c9a3759f27ca 100644
> --- a/include/net/tls.h
> +++ b/include/net/tls.h
> @@ -226,6 +226,7 @@ struct tls_context {
> u8 rx_conf:3;
> u8 zerocopy_sendfile:1;
> u8 rx_no_pad:1;
> + u16 record_size_limit;
Maybe "tx_record_size_limit", since it's not intended for RX?
I don't know if the kernel will ever have a need to enforce the RX
record size, but it would maybe avoid future head-scratching "why is
this not used on the RX path?"
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index a3ccb3135e51..1098c01f2749 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -812,6 +812,31 @@ static int do_tls_setsockopt_no_pad(struct sock *sk, sockptr_t optval,
> return rc;
> }
>
> +static int do_tls_setsockopt_record_size(struct sock *sk, sockptr_t optval,
> + unsigned int optlen)
> +{
> + struct tls_context *ctx = tls_get_ctx(sk);
> + u16 value;
> +
> + if (sockptr_is_null(optval) || optlen != sizeof(value))
> + return -EINVAL;
> +
> + if (copy_from_sockptr(&value, optval, sizeof(value)))
> + return -EFAULT;
> +
> + if (ctx->prot_info.version == TLS_1_2_VERSION &&
> + value > TLS_MAX_PAYLOAD_SIZE)
> + return -EINVAL;
> +
> + if (ctx->prot_info.version == TLS_1_3_VERSION &&
> + value > TLS_MAX_PAYLOAD_SIZE + 1)
> + return -EINVAL;
> +
> + ctx->record_size_limit = value;
> +
> + return 0;
> +}
> +
> static int do_tls_setsockopt(struct sock *sk, int optname, sockptr_t optval,
> unsigned int optlen)
> {
> @@ -833,6 +858,9 @@ static int do_tls_setsockopt(struct sock *sk, int optname, sockptr_t optval,
> case TLS_RX_EXPECT_NO_PAD:
> rc = do_tls_setsockopt_no_pad(sk, optval, optlen);
> break;
> + case TLS_TX_RECORD_SIZE_LIM:
> + rc = do_tls_setsockopt_record_size(sk, optval, optlen);
> + break;
Adding the corresponding changes to do_tls_getsockopt would also be good.
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index bac65d0d4e3e..9f9359f591d3 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1033,6 +1033,7 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
> unsigned char record_type = TLS_RECORD_TYPE_DATA;
> bool is_kvec = iov_iter_is_kvec(&msg->msg_iter);
> bool eor = !(msg->msg_flags & MSG_MORE);
> + u16 record_size_limit;
> size_t try_to_copy;
> ssize_t copied = 0;
> struct sk_msg *msg_pl, *msg_en;
> @@ -1058,6 +1059,9 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
> }
> }
>
> + record_size_limit = tls_ctx->record_size_limit ?
> + tls_ctx->record_size_limit : TLS_MAX_PAYLOAD_SIZE;
As Simon said (good catch Simon :)), this isn't used anywhere. Are you
sure this patch works? The previous version had a hunk in
tls_sw_sendmsg_locked that looks like what I would expect.
And the offloaded TX path (in net/tls/tls_device.c) would also
need similar changes.
I'm wondering if it's better to add this conditional, or just
initialize record_size_limit to TLS_MAX_PAYLOAD_SIZE as we set up the
tls_context. Then we only have to replace TLS_MAX_PAYLOAD_SIZE with
tls_ctx->record_size_limit in a few places?
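Roughly what I have in mind, as a standalone userspace model rather than actual kernel code. The struct and all three helpers are invented for illustration; only TLS_MAX_PAYLOAD_SIZE (1 << 14) comes from the kernel. One consequence of this scheme, if I'm not missing something, is that 0 can no longer mean "unset", so the setter would have to reject it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLS_MAX_PAYLOAD_SIZE 16384 /* same value as the kernel's 1 << 14 */

/* Hypothetical stand-in for the relevant field of struct tls_context. */
struct tls_ctx_model {
	uint16_t tx_record_size_limit;
};

/* Set the default once at context setup, so users never see 0. */
static void tls_ctx_model_init(struct tls_ctx_model *ctx)
{
	ctx->tx_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
}

/* Setter model: with no "0 means default" fallback, 0 must be refused. */
static int tls_ctx_model_set_limit(struct tls_ctx_model *ctx, uint16_t value)
{
	if (value == 0 || value > TLS_MAX_PAYLOAD_SIZE)
		return -1;
	ctx->tx_record_size_limit = value;
	return 0;
}

/* Plaintext bytes that still fit in the currently open record. */
static size_t tls_record_room(const struct tls_ctx_model *ctx, size_t pending)
{
	assert(pending <= ctx->tx_record_size_limit);
	return ctx->tx_record_size_limit - pending;
}

/* Records needed for len bytes; the limit is used directly, no conditional. */
static size_t tls_records_needed(const struct tls_ctx_model *ctx, size_t len)
{
	size_t limit = ctx->tx_record_size_limit;

	return (len + limit - 1) / limit;
}
```

The users on the send path then read the field unconditionally, which is the same shape as the existing TLS_MAX_PAYLOAD_SIZE comparisons.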
--
Sabrina