From: Chuck Lever <cel@kernel.org>
To: john.fastabend@gmail.com, kuba@kernel.org, sd@queasysnail.net
Cc: netdev@vger.kernel.org, kernel-tls-handshake@lists.linux.dev,
Chuck Lever <chuck.lever@oracle.com>,
Hannes Reinecke <hare@suse.de>
Subject: [PATCH v2 8/8] tls: Enable batch async decryption in read_sock
Date: Tue, 10 Mar 2026 20:19:52 -0400
Message-ID: <20260311001952.57059-9-cel@kernel.org>
In-Reply-To: <20260311001952.57059-1-cel@kernel.org>
From: Chuck Lever <chuck.lever@oracle.com>
tls_sw_read_sock() decrypts one TLS record at a time, blocking until
each AEAD operation completes before proceeding. Hardware async
crypto engines depend on pipelining multiple operations to achieve
full throughput, and the one-at-a-time model prevents that. Kernel
consumers such as NVMe-TCP and NFSD (when using TLS) are therefore
unable to benefit from hardware offload.

When ctx->async_capable is true, the submit phase now loops up to
TLS_READ_SOCK_BATCH (16) records. The first record waits via
tls_rx_rec_wait(); subsequent iterations use tls_strp_msg_ready(),
tls_strp_check_rcv_quiet(), and tls_strp_msg_load() to collect
records already queued on the socket without blocking. Each record
is submitted with darg.async set, and all resulting skbs are
appended to rx_list.

After the submit loop, a single tls_decrypt_async_drain() collects
all pending AEAD completions before the deliver phase passes
cleartext records to the consumer. The batch bound of 16 limits
concurrent memory consumption to 16 cleartext skbs plus their AEAD
contexts. If async_capable is false, the loop exits after one
record and the async wait is skipped, preserving prior behavior.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/tls/tls_sw.c | 95 +++++++++++++++++++++++++++++++++++++++---------
1 file changed, 78 insertions(+), 17 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 535c856d64e0..0e2b7d285d06 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -261,6 +261,12 @@ static int tls_decrypt_async_drain(struct tls_sw_context_rx *ctx)
return ret;
}
+/* Submit an AEAD decrypt request. On success with darg->async set,
+ * the caller must not touch aead_req; the completion handler frees
+ * it. Every error return clears darg->async and guarantees no
+ * in-flight AEAD operation remains -- callers rely on this to
+ * safely free aead_req and to skip async drain on error paths.
+ */
static int tls_do_decryption(struct sock *sk,
struct scatterlist *sgin,
struct scatterlist *sgout,
@@ -2347,6 +2353,13 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
goto splice_read_end;
}
+/* Bound on concurrent async AEAD submissions per read_sock
+ * call. Chosen to fill typical hardware crypto pipelines
+ * without excessive memory consumption (each in-flight record
+ * holds one cleartext skb plus its AEAD request context).
+ */
+#define TLS_READ_SOCK_BATCH 16
+
int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
sk_read_actor_t read_actor)
{
@@ -2358,6 +2371,7 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
struct sk_psock *psock;
size_t flushed_at = 0;
bool released = true;
+ bool async = false;
struct tls_msg *tlm;
ssize_t copied = 0;
ssize_t decrypted;
@@ -2380,31 +2394,68 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
decrypted = 0;
for (;;) {
struct tls_decrypt_arg darg;
+ int nr_async = 0;
- /* Phase 1: Submit -- decrypt one record onto rx_list.
+ /* Phase 1: Submit -- decrypt records onto rx_list.
* Flush the backlog first so that segments that
* arrived while the lock was held appear on
* sk_receive_queue before tls_rx_rec_wait waits
* for a new record.
*/
if (skb_queue_empty(&ctx->rx_list)) {
- sk_flush_backlog(sk);
- err = tls_rx_rec_wait(sk, NULL, true, released);
- if (err <= 0)
+ while (nr_async < TLS_READ_SOCK_BATCH) {
+ if (nr_async == 0) {
+ sk_flush_backlog(sk);
+ err = tls_rx_rec_wait(sk, NULL,
+ true,
+ released);
+ if (err <= 0)
+ goto read_sock_end;
+ } else {
+ if (!tls_strp_msg_ready(ctx)) {
+ tls_strp_check_rcv_quiet(&ctx->strp);
+ if (!tls_strp_msg_ready(ctx))
+ break;
+ }
+ if (!tls_strp_msg_load(&ctx->strp,
+ released))
+ break;
+ }
+
+ memset(&darg.inargs, 0, sizeof(darg.inargs));
+ darg.async = ctx->async_capable;
+
+ err = tls_rx_decrypt_record(sk, NULL,
+ &darg);
+ if (err < 0)
+ goto read_sock_end;
+
+ async |= darg.async;
+ released = tls_read_flush_backlog(sk, prot,
+ INT_MAX,
+ 0,
+ decrypted,
+ &flushed_at);
+ decrypted += strp_msg(darg.skb)->full_len;
+ tls_rx_rec_release(ctx);
+ __skb_queue_tail(&ctx->rx_list, darg.skb);
+ nr_async++;
+
+ if (!ctx->async_capable)
+ break;
+ }
+ }
+
+ /* Async wait -- collect pending AEAD completions */
+ if (async) {
+ int ret = tls_decrypt_async_drain(ctx);
+
+ async = false;
+ if (ret) {
+ __skb_queue_purge(&ctx->rx_list);
+ err = ret;
goto read_sock_end;
-
- memset(&darg.inargs, 0, sizeof(darg.inargs));
-
- err = tls_rx_decrypt_record(sk, NULL, &darg);
- if (err < 0)
- goto read_sock_end;
-
- released = tls_read_flush_backlog(sk, prot, INT_MAX,
- 0, decrypted,
- &flushed_at);
- decrypted += strp_msg(darg.skb)->full_len;
- tls_rx_rec_release(ctx);
- __skb_queue_tail(&ctx->rx_list, darg.skb);
+ }
}
/* Phase 2: Deliver -- drain rx_list to read_actor */
@@ -2442,6 +2493,16 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
}
read_sock_end:
+ if (async) {
+ int ret = tls_decrypt_async_drain(ctx);
+
+ __skb_queue_purge(&ctx->rx_list);
+ /* Preserve the error that triggered early exit;
+ * a crypto drain error is secondary.
+ */
+ if (ret && !err)
+ err = ret;
+ }
tls_strp_check_rcv(&ctx->strp);
tls_rx_reader_release(sk, ctx);
return copied ? : err;
--
2.53.0