Re: [PATCH v3] net/tls: support maximum record size limit

netdev.vger.kernel.org archive mirror

From: Wilfred Mallawa <wilfred.opensource@gmail.com>
To: Sabrina Dubroca <sd@queasysnail.net>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, horms@kernel.org, corbet@lwn.net,
	john.fastabend@gmail.com, netdev@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	alistair.francis@wdc.com, dlemoal@kernel.org
Subject: Re: [PATCH v3] net/tls: support maximum record size limit
Date: Thu, 18 Sep 2025 10:52:14 +1000
Message-ID: <05dc7efdb45363358825ff3782d3006ef9c6cea4.camel@gmail.com>
In-Reply-To: <aLgVCGbq0b6PJXbY@krikkit>

Hey Sabrina,

Sorry for the delay in getting back to this! Responded inline.

On Wed, 2025-09-03 at 12:14 +0200, Sabrina Dubroca wrote:
> note: since this is a new feature, the subject prefix should be
> "[PATCH net-next vN]" (ie add "net-next", the target tree for "new
> feature" changes)
> 
> 2025-09-03, 11:47:57 +1000, Wilfred Mallawa wrote:
> > diff --git a/Documentation/networking/tls.rst
> > b/Documentation/networking/tls.rst
> > index 36cc7afc2527..0232df902320 100644
> > --- a/Documentation/networking/tls.rst
> > +++ b/Documentation/networking/tls.rst
> > @@ -280,6 +280,13 @@ If the record decrypted turns out to had been
> > padded or is not a data
> >  record it will be decrypted again into a kernel buffer without
> > zero copy.
> >  Such events are counted in the ``TlsDecryptRetry`` statistic.
> >  
> > +TLS_TX_RECORD_SIZE_LIM
> > +~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +During a TLS handshake, an endpoint may use the record size limit
> > extension
> > +to specify a maximum record size. This allows enforcing the
> > specified record
> > +size limit, such that outgoing records do not exceed the limit
> > specified.
> 
> Maybe worth adding a reference to the RFC that defines this
> extension?
> I'm not sure if that would be helpful to readers of this doc or not.
Good idea, I'll add that in.
> 
> 
> > diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> > index a3ccb3135e51..94237c97f062 100644
> > --- a/net/tls/tls_main.c
> > +++ b/net/tls/tls_main.c
> [...]
> > @@ -1022,6 +1075,7 @@ static int tls_init(struct sock *sk)
> >  
> >  	ctx->tx_conf = TLS_BASE;
> >  	ctx->rx_conf = TLS_BASE;
> > +	ctx->tx_record_size_limit = TLS_MAX_PAYLOAD_SIZE;
> >  	update_sk_prot(sk, ctx);
> >  out:
> >  	write_unlock_bh(&sk->sk_callback_lock);
> > @@ -1065,7 +1119,7 @@ static u16 tls_user_config(struct tls_context
> > *ctx, bool tx)
> >  
> >  static int tls_get_info(struct sock *sk, struct sk_buff *skb, bool
> > net_admin)
> >  {
> > -	u16 version, cipher_type;
> > +	u16 version, cipher_type, tx_record_size_limit;
> >  	struct tls_context *ctx;
> >  	struct nlattr *start;
> >  	int err;
> > @@ -1110,7 +1164,13 @@ static int tls_get_info(struct sock *sk,
> > struct sk_buff *skb, bool net_admin)
> >  		if (err)
> >  			goto nla_failure;
> >  	}
> > -
> > +	tx_record_size_limit = ctx->tx_record_size_limit;
> > +	if (tx_record_size_limit) {
> 
> You probably meant to update that to:
> 
>     tx_record_size_limit != TLS_MAX_PAYLOAD_SIZE
> 
> Otherwise, now that the default is TLS_MAX_PAYLOAD_SIZE, it will
> always be exported - which is not wrong either. So I'd either update
> the conditional so that the attribute is only exported for non-
> default
> sizes (like in v2), or drop the if() and always export it.
> 
Yeah, that makes sense. I'll drop the if() so that it's always
exported.
> > +		err = nla_put_u16(skb,
> > TLS_INFO_TX_RECORD_SIZE_LIM,
> > +				  tx_record_size_limit);
> > +		if (err)
> > +			goto nla_failure;
> > +	}
> 
> [...]
> > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > index bac65d0d4e3e..28fb796573d1 100644
> > --- a/net/tls/tls_sw.c
> > +++ b/net/tls/tls_sw.c
> > @@ -1079,7 +1079,7 @@ static int tls_sw_sendmsg_locked(struct sock
> > *sk, struct msghdr *msg,
> >  		orig_size = msg_pl->sg.size;
> >  		full_record = false;
> >  		try_to_copy = msg_data_left(msg);
> > -		record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl-
> > >sg.size;
> > +		record_room = tls_ctx->tx_record_size_limit -
> > msg_pl->sg.size;
> 
> If we entered tls_sw_sendmsg_locked with an existing open record,
> this
> could end up being negative and confuse the rest of the code.
> 
>     send(MSG_MORE) returns with an open record of length len1
>     setsockopt(TLS_INFO_TX_RECORD_SIZE_LIM, limit < len1)
>     send() -> record_room < 0
> 
> 
> Possibly not a problem with a "well-behaved" userspace, but we can't
> rely on that.
Good catch! What if we don't allow tx_record_size_limit to be set while
there's a pending open record? That should at least prevent userspace
from causing record_room < 0, if we somehow end up there.

So for example:

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 7c0367dc5d40..34bb6690016c 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -841,20 +841,27 @@ static int
do_tls_setsockopt_tx_record_size(struct sock *sk, sockptr_t optval,
                                            unsigned int optlen)
 {
        struct tls_context *ctx = tls_get_ctx(sk);
+       struct tls_sw_context_tx *sw_ctx = tls_sw_ctx_tx(ctx);
        u16 value;
 
+       if (sw_ctx->open_rec)
+               return -EBUSY;
...
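
To make the failure mode and the guard concrete, here is a small
userspace model. It is not kernel code: the model_* names and the
struct layout are made up for illustration, and MODEL_MAX_PAYLOAD only
stands in for TLS_MAX_PAYLOAD_SIZE (2^14). It just mirrors the
record_room arithmetic and the -EBUSY check from the snippet above.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for TLS_MAX_PAYLOAD_SIZE (2^14 bytes of plaintext). */
#define MODEL_MAX_PAYLOAD 16384

/* Hypothetical model of a pending (open) TX record. */
struct model_rec {
	size_t len;	/* bytes already queued in the open record */
};

/* Hypothetical model of the TX context; not the kernel's struct. */
struct model_tx_ctx {
	unsigned short record_size_limit;
	struct model_rec *open_rec;	/* non-NULL after send(MSG_MORE) */
};

/*
 * Room left in the current record. Kept signed on purpose so that the
 * "limit < open record length" case shows up as a negative value.
 */
static long model_record_room(const struct model_tx_ctx *ctx)
{
	size_t used = ctx->open_rec ? ctx->open_rec->len : 0;

	return (long)ctx->record_size_limit - (long)used;
}

/* Refuse to change the limit while a record is still open. */
static int model_set_limit(struct model_tx_ctx *ctx, unsigned short limit)
{
	if (ctx->open_rec)
		return -EBUSY;

	ctx->record_size_limit = limit;
	return 0;
}
```

With an open record of 4000 bytes, model_set_limit(&ctx, 1024) returns
-EBUSY and the limit stays at its previous value. Without the check,
the limit would become 1024 and model_record_room() would return
1024 - 4000 = -2976, which is the case described above.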

And to your follow-up response:

```
> I suspect it's not a problem in practice because of what the TLS
> exchange between the peers setting up this extension looks like? (ie,
> there should never be an open record at this stage - unless userspace
> delays doing this setsockopt after getting the message from the peer,
> but then maybe we can call that a buggy userspace)
```

Yeah, the record size limit extension is negotiated during the
handshake (Client/ServerHello). AFAIK, all of that is handled in
tlshd/GnuTLS, so we shouldn't have any open records at this point. For
user-space context, this is what support for the record size limit
looks like [1] in tlshd.

If, for whatever reason, userspace decides to set it later as you
mentioned, the change above could mitigate the open record case? I
don't think we need to try to fix things in the kernel (nor can we, for
records already submitted to TCP).

[1] WIP: https://github.com/twilfredo/ktls-utils/commit/73cb755acb4589ba31e4c42ef6b16cf5efdf3892
> 
> 
> Pushing out the pending "too big" record at the time we set
> tx_record_size_limit would likely make the peer close the connection
> (because it's already told us to limit our TX size), so I guess we'd
> have to split the pending record into tx_record_size_limit chunks
> before we start processing the new message (either directly at
> setsockopt(TLS_INFO_TX_RECORD_SIZE_LIM) time, or the next send/etc
> call). The final push during socket closing, and maybe some more
> codepaths that deal with ctx->open_rec, would also have to do that.
> 
> I think additional selftests for
>     send(MSG_MORE), TLS_INFO_TX_RECORD_SIZE_LIM, send
> and
>     send(MSG_MORE), TLS_INFO_TX_RECORD_SIZE_LIM, close
> verifying the received record sizes would make sense, since it's a
> bit
> tricky to get that right.
Yeah, I agree. I will work on that. Thanks for the feedback!
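
As a rough sketch of the property those selftests would check (every
received record is at most the limit, and the sizes add up to what was
sent), here is the splitting arithmetic in plain C. This is only the
arithmetic, not how the kernel would split an open record;
model_split_sizes() is a hypothetical helper.

```c
#include <stddef.h>

/*
 * Hypothetical helper: split `pending` bytes into records of at most
 * `limit` bytes each and store each record's length in sizes[].
 * Returns the number of records, or 0 if limit is 0, nothing is
 * pending, or sizes[] (capacity `max`) is too small.
 */
static size_t model_split_sizes(size_t pending, size_t limit,
				size_t *sizes, size_t max)
{
	size_t n = 0;

	if (limit == 0)
		return 0;

	while (pending > 0) {
		size_t chunk = pending < limit ? pending : limit;

		if (n == max)
			return 0;

		sizes[n++] = chunk;
		pending -= chunk;
	}

	return n;
}
```

For a 4000 byte open record and a limit of 1024, this yields four
records of 1024, 1024, 1024 and 928 bytes.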

Regards,
Wilfred


Thread overview: 9+ messages
2025-09-03  1:47 [PATCH v3] net/tls: support maximum record size limit Wilfred Mallawa
2025-09-03 10:14 ` Sabrina Dubroca
2025-09-04  9:54   ` Sabrina Dubroca
2025-09-18  0:52   ` Wilfred Mallawa [this message]
2025-09-03 22:51 ` Jakub Kicinski
2025-09-04 23:41   ` Wilfred Mallawa
2025-09-04 10:10 ` Sabrina Dubroca
2025-09-04 13:33   ` Jakub Kicinski
2025-09-18  1:42   ` Wilfred Mallawa
