From: Jakub Kicinski <kuba@kernel.org>
To: rjethwani@purestorage.com
Cc: Jakub Kicinski <kuba@kernel.org>,
netdev@vger.kernel.org, saeedm@nvidia.com, tariqt@nvidia.com,
mbloch@nvidia.com, borisp@nvidia.com, john.fastabend@gmail.com,
sd@queasysnail.net, davem@davemloft.net, pabeni@redhat.com,
edumazet@google.com, leon@kernel.org
Subject: Re: [PATCH v14 7/9] tls: device: add RX KeyUpdate support
Date: Mon, 25 May 2026 14:16:10 -0700 [thread overview]
Message-ID: <20260525211610.2508479-1-kuba@kernel.org> (raw)
In-Reply-To: <20260515212715.3151307-8-rjethwani@purestorage.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
tls: device: add RX KeyUpdate support
This adds RX-side TLS 1.3 KeyUpdate support to the device offload path,
keeping the old AEAD/IV/rec_seq around to handle in-flight records that
the NIC already decrypted with the retired key, and deferring tls_dev_add
of the new key until the old key has fully drained.
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> index c435b3450872..1c58cbd55ffb 100644
> --- a/net/tls/tls_device.c
> +++ b/net/tls/tls_device.c
[ ... ]
> @@ -1552,13 +1722,30 @@ int tls_set_device_offload(struct sock *sk,
> return rc;
> }
>
> -int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
> +int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx,
> + struct tls_crypto_info *new_crypto_info)
> {
[ ... ]
> + if (new_crypto_info) {
> + struct tls_sw_context_rx *sw_ctx = tls_sw_ctx_rx(ctx);
> +
> + if (!test_bit(TLS_RX_DEV_CLOSED, &ctx->flags)) {
> + set_bit(TLS_RX_DEV_CLOSED, &ctx->flags);
> + synchronize_net();
> + netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
> + TLS_OFFLOAD_CTX_DIR_RX);
> + }
> +
> + if (context->rekey.old_aead_recv &&
> + before(copied_seq, context->rekey.old_nic_boundary)) {
[ ... ]
> + context->dev_add_pending = 1;
> + } else {
> + u32 rcv_nxt;
> +
> + if (context->rekey.old_aead_recv) {
> + crypto_free_aead(context->rekey.old_aead_recv);
> + context->rekey.old_aead_recv = NULL;
> + }
> +
> + /* flush the backlog so rcv_nxt is accurate */
> + __sk_flush_backlog(sk);
> + rcv_nxt = tcp_sk(sk)->rcv_nxt;
> +
> + if (before(copied_seq, rcv_nxt)) {
> + context->rekey.old_aead_recv = sw_ctx->aead_recv;
> + sw_ctx->aead_recv = NULL;
> + memcpy(context->rekey.old_iv, ctx->rx.iv,
> + sizeof(context->rekey.old_iv));
> + memcpy(context->rekey.old_rec_seq, ctx->rx.rec_seq,
> + sizeof(context->rekey.old_rec_seq));
> + context->rekey.old_nic_boundary = rcv_nxt;
> + context->dev_add_pending = 1;
> + }
> + }
> + }
> +
> + rc = tls_sw_ctx_init(sk, 0, new_crypto_info);
> if (rc)
> goto release_ctx;
[High]
In the rekey branch, sw_ctx->aead_recv has just been moved into
context->rekey.old_aead_recv (with sw_ctx->aead_recv set to NULL) and the
NIC offload has already been torn down via tls_dev_del with
TLS_RX_DEV_CLOSED set.
If tls_sw_ctx_init() then fails (for example crypto_alloc_aead, setkey,
setauthsize, or an ENOMEM in the strparser init), control jumps to
release_ctx, and for the new_crypto_info case that label is essentially a
no-op:
release_ctx:
if (!new_crypto_info) {
kfree(ctx->priv_ctx_rx);
ctx->priv_ctx_rx = NULL;
}
So sw_ctx->aead_recv stays NULL, the rekey state stays populated,
TLS_RX_DEV_CLOSED stays set, and tls_sw_ctx_finalize() is never called.
The error returns to do_tls_setsockopt_conf() and, because
update && rx_conf == TLS_HW, it goes straight to err_crypto_info without
restoring anything, while rx_conf is still TLS_HW and the socket remains
open.
Can a subsequent recvmsg on this socket then NULL-deref in tls_decrypt_sg(),
which uses ctx->aead_recv unconditionally to compute
sizeof(*aead_req) + crypto_aead_reqsize(ctx->aead_recv)?
Should this path roll back the rekey state (restore sw_ctx->aead_recv from
rekey.old_aead_recv, clear rekey state, clear dev_add_pending) on
tls_sw_ctx_init() failure?
[High]
The error path renaming from tls_sw_free_resources_rx() to
tls_sw_release_resources_rx() in free_sw_resources, combined with a later
kfree(ctx->priv_ctx_rx), looks like it drops the strparser anchor cleanup:
> free_sw_resources:
> up_read(&device_offload_lock);
> - tls_sw_free_resources_rx(sk);
> + tls_sw_release_resources_rx(sk);
> down_read(&device_offload_lock);
> release_ctx:
> - ctx->priv_ctx_rx = NULL;
> + if (!new_crypto_info) {
> + kfree(ctx->priv_ctx_rx);
> + ctx->priv_ctx_rx = NULL;
> + }
tls_sw_free_resources_rx() previously did three things: release_resources_rx
(frees aead_recv, stops strparser), __tls_strp_done() ->
tls_strp_anchor_free() which consume_skb()s the anchor allocated in
tls_strp_init() during tls_sw_ctx_init(), and finally tls_sw_free_ctx_rx().
The new code only does steps one and three — calling kfree(priv_ctx_rx)
without first doing the anchor consume_skb().
Since tls_strp_init() runs inside tls_sw_ctx_init() during the
!new_crypto_info path, and free_sw_resources is reached only after
tls_device_dev_add_rx() fails (i.e. after init succeeded), is the
strp->anchor SKB leaked on every initial-setup failure here?
> + if (!context->dev_add_pending) {
> + rc = tls_device_dev_add_rx(sk, ctx, netdev, src_crypto_info,
> + copied_seq, !!new_crypto_info);
> + if (!new_crypto_info) {
> + if (rc)
> + goto free_sw_resources;
> + tls_device_attach(ctx, sk, netdev);
> + }
> + } else if (!was_dev_add_pending) {
> + TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXREKEYINPROGRESS);
> + } else {
> + TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXREKEYOK);
> + }
[Medium, Medium]
For the rekey path, context->dev_add_pending is set to 1 before
tls_sw_ctx_init() is called, but the matching
TLS_INC_STATS(LINUX_MIB_TLSRXREKEYINPROGRESS) only happens later in the
"else if (!was_dev_add_pending)" branch after init has succeeded.
If tls_sw_ctx_init() fails, the goto release_ctx is a no-op for
new_crypto_info, leaving dev_add_pending == 1 with no prior
REKEYINPROGRESS increment.
When the socket closes, tls_device_offload_cleanup_rx() then runs:
if (rx_ctx && rx_ctx->dev_add_pending) {
rx_ctx->dev_add_pending = 0;
TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXREKEYOK);
TLS_DEC_STATS(sock_net(sk), LINUX_MIB_TLSRXREKEYINPROGRESS);
}
Does this decrement TLSRXREKEYINPROGRESS without a prior matching
increment, wrapping the per-CPU SNMP counter to a very large unsigned
value, and bump TLSRXREKEYOK for a rekey that never installed a new key
on the NIC?
Separately, in the same cleanup tail, dev_add_pending == 1 means the
deferred tls_dev_add never ran, so the new key was not installed on the
NIC. Should this path really increment LINUX_MIB_TLSRXREKEYOK, given that
tls_device_dev_add_rx() is the documented place that bumps REKEYOK after
the NIC accepts the new key?
[Medium]
In the same hunk, when was_dev_add_pending is 1 (a previous rekey is still
draining) and a new rekey arrives, the third arm runs:
} else {
TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXREKEYOK);
}
dev_add_pending is still 1 here and the new key has not been installed on
the NIC yet. The eventual tls_device_deferred_dev_add_rx() will also call
tls_device_dev_add_rx() with is_rekey=true, which on success increments
LINUX_MIB_TLSRXREKEYOK again.
Is the same successful rekey getting counted twice in this case, and a
never-completed nested rekey getting counted as success?
next prev parent reply other threads:[~2026-05-25 21:16 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-15 21:27 [PATCH net-next v14 0/9] tls: Add TLS 1.3 hardware offload support Rishikesh Jethwani
2026-05-15 21:27 ` [PATCH v14 1/9] net: tls: reject TLS 1.3 offload in chcr_ktls and nfp drivers Rishikesh Jethwani
2026-05-15 21:27 ` [PATCH v14 2/9] net/mlx5e: add TLS 1.3 hardware offload support Rishikesh Jethwani
2026-05-15 21:27 ` [PATCH v14 3/9] tls: " Rishikesh Jethwani
2026-05-15 21:27 ` [PATCH v14 4/9] tls: split tls_set_sw_offload into init and finalize stages Rishikesh Jethwani
2026-05-15 21:27 ` [PATCH v14 5/9] tls: prep helpers and refactors for HW offload KeyUpdate Rishikesh Jethwani
2026-05-15 21:27 ` [PATCH v14 6/9] tls: device: add TX KeyUpdate support Rishikesh Jethwani
2026-05-25 21:16 ` Jakub Kicinski
2026-05-15 21:27 ` [PATCH v14 7/9] tls: device: add RX " Rishikesh Jethwani
2026-05-25 21:16 ` Jakub Kicinski [this message]
2026-05-15 21:27 ` [PATCH v14 8/9] tls: device: add tracepoints for RX KeyUpdate path Rishikesh Jethwani
2026-05-15 21:27 ` [PATCH v14 9/9] selftests: net: add TLS hardware offload test Rishikesh Jethwani
2026-05-25 21:16 ` Jakub Kicinski
2026-05-17 22:21 ` [PATCH net-next v14 0/9] tls: Add TLS 1.3 hardware offload support Sabrina Dubroca
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260525211610.2508479-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=borisp@nvidia.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=john.fastabend@gmail.com \
--cc=leon@kernel.org \
--cc=mbloch@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=rjethwani@purestorage.com \
--cc=saeedm@nvidia.com \
--cc=sd@queasysnail.net \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox