From: Chuck Lever <chuck.lever@oracle.com>
To: Chuck Lever <cel@kernel.org>, NeilBrown <neil@brown.name>,
Jeff Layton <jlayton@kernel.org>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>
Subject: Re: [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied
Date: Thu, 14 May 2026 16:57:46 -0400 [thread overview]
Message-ID: <4890ba05-19ab-484b-a123-6649900ac34e@oracle.com> (raw)
In-Reply-To: <20260514205607.348291-3-cel@kernel.org>
On 5/14/26 4:56 PM, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> The sk_err check in tls_rx_rec_wait() consumes the error via
> sock_error(), which clears sk_err atomically. When the caller
> (tls_sw_recvmsg, tls_sw_splice_read, or tls_sw_read_sock) already
> has bytes copied to userspace, it returns those bytes and discards
> the error from this call. sk_err is now zero on the socket, so the
> next read syscall observes only RCV_SHUTDOWN and reports a clean
> EOF instead of the actual error (typically -ECONNRESET).
>
> The race is reachable when tls_read_flush_backlog()'s periodic
> sk_flush_backlog() triggers tcp_reset() in the middle of a
> multi-record read.
>
> Pass a has_copied flag to tls_rx_rec_wait(). When has_copied is
> false, consume sk_err via sock_error() as before. When has_copied
> is true, report the error from READ_ONCE() but leave sk_err set:
> the caller returns the byte count and discards the err from this
> call, and the next read syscall surfaces the preserved sk_err. This
> mirrors the tcp_recvmsg() preserve-and-surface pattern.
>
> The decrypt-abort path is unaffected: tls_err_abort() raises
> sk_err to EBADMSG after tls_rx_rec_wait() returns, and nothing
> on the caller's return path consumes it, so the EBADMSG surfaces
> on the next read.
>
> tls_sw_splice_read() passes has_copied=false: it processes
> one record per call, so no bytes have been copied within the
> function when tls_rx_rec_wait() runs. A reset that arrives
> between iterations of splice_direct_to_actor() (the sendfile()
> path) is still consumed by sock_error() in the later call, and the
> outer loop returns the prior iterations' byte count and drops the
> error. tcp_splice_read() exhibits the same pattern at the iteration
> boundary; addressing it belongs at the splice_direct_to_actor()
> layer and is out of scope here.
>
> Fixes: c46b01839f7a ("tls: rx: periodically flush socket backlog")
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> net/tls/tls_sw.c | 20 ++++++++++++++------
> 1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 2590e855f6a5..c4cc4e357848 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1356,9 +1356,14 @@ void tls_sw_splice_eof(struct socket *sock)
> mutex_unlock(&tls_ctx->tx_lock);
> }
>
> +/* When has_copied is true the caller has already moved bytes to
> + * userspace. Report sk_err but leave it set so the next read
> + * surfaces it instead of a spurious EOF, otherwise sk_err is
> + * consumed via sock_error().
> + */
> static int
> tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
> - bool released)
> + bool released, bool has_copied)
> {
> struct tls_context *tls_ctx = tls_get_ctx(sk);
> struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
> @@ -1376,8 +1381,11 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
> if (!sk_psock_queue_empty(psock))
> return 0;
>
> - if (sk->sk_err)
> + if (sk->sk_err) {
> + if (has_copied)
> + return -READ_ONCE(sk->sk_err);
> return sock_error(sk);
> + }
>
> if (ret < 0)
> return ret;
> @@ -1413,7 +1421,7 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
> }
>
> if (unlikely(!tls_strp_msg_load(&ctx->strp, released)))
> - return tls_rx_rec_wait(sk, psock, nonblock, false);
> + return tls_rx_rec_wait(sk, psock, nonblock, false, has_copied);
>
> return 1;
> }
> @@ -2100,7 +2108,7 @@ int tls_sw_recvmsg(struct sock *sk,
> int to_decrypt, chunk;
>
> err = tls_rx_rec_wait(sk, psock, flags & MSG_DONTWAIT,
> - released);
> + released, !!(decrypted + copied));
> if (err <= 0) {
> if (psock) {
> chunk = sk_msg_recvmsg(sk, psock, msg, len,
> @@ -2287,7 +2295,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
> struct tls_decrypt_arg darg;
>
> err = tls_rx_rec_wait(sk, NULL, flags & SPLICE_F_NONBLOCK,
> - true);
> + true, false);
> if (err <= 0)
> goto splice_read_end;
>
> @@ -2373,7 +2381,7 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
> } else {
> struct tls_decrypt_arg darg;
>
> - err = tls_rx_rec_wait(sk, NULL, true, released);
> + err = tls_rx_rec_wait(sk, NULL, true, released, !!copied);
> if (err <= 0)
> goto read_sock_end;
>
Ignore this email. It looks like something left over from a previous
posting.
--
Chuck Lever
next prev parent reply other threads:[~2026-05-14 20:58 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-14 20:56 [PATCH 0/3] three lockd fixes Chuck Lever
2026-05-14 20:56 ` [PATCH 1/3] lockd: Plug nlm_file leak when nlm_do_fopen() fails Chuck Lever
2026-05-14 20:56 ` [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied Chuck Lever
2026-05-14 20:57 ` Chuck Lever [this message]
2026-05-14 20:56 ` [PATCH 2/3] lockd: Plug nlm_file refcount leak on cached nlm_do_fopen() failure Chuck Lever
2026-05-14 20:56 ` [PATCH 3/3] lockd: Avoid hashing uninitialized bytes in nlm4svc_lookup_file() Chuck Lever
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4890ba05-19ab-484b-a123-6649900ac34e@oracle.com \
--to=chuck.lever@oracle.com \
--cc=cel@kernel.org \
--cc=dai.ngo@oracle.com \
--cc=jlayton@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=neil@brown.name \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox