* [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied
@ 2026-05-13 12:58 Chuck Lever
2026-05-15 1:30 ` patchwork-bot+netdevbpf
0 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2026-05-13 12:58 UTC (permalink / raw)
To: john.fastabend, kuba, sd, davem, edumazet, pabeni
Cc: horms, netdev, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
The sk_err check in tls_rx_rec_wait() consumes the error via
sock_error(), which clears sk_err atomically. When the caller
(tls_sw_recvmsg, tls_sw_splice_read, or tls_sw_read_sock) already
has bytes copied to userspace, it returns those bytes and discards
the error from this call. sk_err is now zero on the socket, so the
next read syscall observes only RCV_SHUTDOWN and reports a clean
EOF instead of the actual error (typically -ECONNRESET).
The race is reachable when tls_read_flush_backlog()'s periodic
sk_flush_backlog() triggers tcp_reset() in the middle of a
multi-record read.
Pass a has_copied flag to tls_rx_rec_wait(). When has_copied is
false, consume sk_err via sock_error() as before. When has_copied
is true, report the error from READ_ONCE() but leave sk_err set:
the caller returns the byte count and discards the err from this
call, and the next read syscall surfaces the preserved sk_err. This
mirrors the tcp_recvmsg() preserve-and-surface pattern.
The decrypt-abort path is unaffected: tls_err_abort() raises
sk_err to EBADMSG after tls_rx_rec_wait() returns, and nothing
on the caller's return path consumes it, so the EBADMSG surfaces
on the next read.
tls_sw_splice_read() passes has_copied=false: it processes
one record per call, so no bytes have been copied within the
function when tls_rx_rec_wait() runs. A reset that arrives
between iterations of splice_direct_to_actor() (the sendfile()
path) is still consumed by sock_error() in the later call, and the
outer loop returns the prior iterations' byte count and drops the
error. tcp_splice_read() exhibits the same pattern at the iteration
boundary; addressing it belongs at the splice_direct_to_actor()
layer and is out of scope here.
Fixes: c46b01839f7a ("tls: rx: periodically flush socket backlog")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/tls/tls_sw.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 2590e855f6a5..c4cc4e357848 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1356,9 +1356,14 @@ void tls_sw_splice_eof(struct socket *sock)
mutex_unlock(&tls_ctx->tx_lock);
}
+/* When has_copied is true the caller has already moved bytes to
+ * userspace. Report sk_err but leave it set so the next read
+ * surfaces it instead of a spurious EOF, otherwise sk_err is
+ * consumed via sock_error().
+ */
static int
tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
- bool released)
+ bool released, bool has_copied)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
@@ -1376,8 +1381,11 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
if (!sk_psock_queue_empty(psock))
return 0;
- if (sk->sk_err)
+ if (sk->sk_err) {
+ if (has_copied)
+ return -READ_ONCE(sk->sk_err);
return sock_error(sk);
+ }
if (ret < 0)
return ret;
@@ -1413,7 +1421,7 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
}
if (unlikely(!tls_strp_msg_load(&ctx->strp, released)))
- return tls_rx_rec_wait(sk, psock, nonblock, false);
+ return tls_rx_rec_wait(sk, psock, nonblock, false, has_copied);
return 1;
}
@@ -2100,7 +2108,7 @@ int tls_sw_recvmsg(struct sock *sk,
int to_decrypt, chunk;
err = tls_rx_rec_wait(sk, psock, flags & MSG_DONTWAIT,
- released);
+ released, !!(decrypted + copied));
if (err <= 0) {
if (psock) {
chunk = sk_msg_recvmsg(sk, psock, msg, len,
@@ -2287,7 +2295,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
struct tls_decrypt_arg darg;
err = tls_rx_rec_wait(sk, NULL, flags & SPLICE_F_NONBLOCK,
- true);
+ true, false);
if (err <= 0)
goto splice_read_end;
@@ -2373,7 +2381,7 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
} else {
struct tls_decrypt_arg darg;
- err = tls_rx_rec_wait(sk, NULL, true, released);
+ err = tls_rx_rec_wait(sk, NULL, true, released, !!copied);
if (err <= 0)
goto read_sock_end;
--
2.54.0
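The preserve-vs-consume behavior the commit message describes can be modeled in a few lines of userspace C. This is a toy sketch for illustration only; the struct and helpers mimic, but are not, the kernel's (`sock_error()` here reproduces only the consume-on-read semantics relevant to the bug):

```c
#include <assert.h>

/* Toy model of socket error reporting: sock_error() consumes
 * sk_err, so a later caller sees zero and reports a clean EOF.
 */
struct sock {
	int sk_err;
};

static int sock_error(struct sock *sk)
{
	int err = sk->sk_err;

	sk->sk_err = 0;		/* error consumed */
	return err ? -err : 0;
}

/* Sketch of the fixed wait logic: when the caller has already
 * copied bytes to userspace, report the error but leave sk_err
 * set so the next read syscall can surface it.
 */
static int rec_wait(struct sock *sk, int has_copied)
{
	if (sk->sk_err) {
		if (has_copied)
			return -sk->sk_err;	/* report, preserve */
		return sock_error(sk);		/* report, consume */
	}
	return 1;	/* a record is ready */
}
```

With has_copied set, sk_err (e.g. ECONNRESET) survives the call, so the caller can return its byte count and the follow-up read still reports the error instead of a spurious EOF.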
* [PATCH 0/3] three lockd fixes
@ 2026-05-14 20:56 Chuck Lever
2026-05-14 20:56 ` [PATCH 1/3] lockd: Plug nlm_file leak when nlm_do_fopen() fails Chuck Lever
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-14 20:56 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
Jeff reported an issue earlier today; while crafting the fix, an
LLM-assisted review turned up two more pre-existing problems nearby.
Chuck Lever (3):
lockd: Plug nlm_file leak when nlm_do_fopen() fails
lockd: Plug nlm_file refcount leak on cached nlm_do_fopen() failure
lockd: Avoid hashing uninitialized bytes in nlm4svc_lookup_file()
fs/lockd/lockd.h | 8 ++++++++
fs/lockd/svc4proc.c | 3 +++
fs/lockd/svcsubs.c | 7 ++++---
3 files changed, 15 insertions(+), 3 deletions(-)
--
2.54.0
* [PATCH 1/3] lockd: Plug nlm_file leak when nlm_do_fopen() fails
2026-05-14 20:56 [PATCH 0/3] three lockd fixes Chuck Lever
@ 2026-05-14 20:56 ` Chuck Lever
2026-05-14 20:56 ` [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied Chuck Lever
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-14 20:56 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
A client can repeatedly drive nlm_do_fopen() failures by presenting
file handles that the underlying export rejects. After kzalloc_obj()
succeeds in nlm_lookup_file(), the freshly allocated nlm_file is not
yet inserted into nlm_files[]. The nlm_do_fopen() failure path jumps
to out_unlock, which releases nlm_file_mutex and returns without
freeing the allocation, so each failure leaks one nlm_file.
Route the failure through out_free so kfree() runs before the
function returns.
Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/lockd/svcsubs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index e24bacea7e03..0b81d8db0919 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -166,7 +166,7 @@ nlm_lookup_file(struct svc_rqst *rqstp, struct nlm_file **result,
nfserr = nlm_do_fopen(rqstp, file, mode);
if (nfserr)
- goto out_unlock;
+ goto out_free;
hlist_add_head(&file->f_list, &nlm_files[hash]);
--
2.54.0
* [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied
2026-05-14 20:56 [PATCH 0/3] three lockd fixes Chuck Lever
2026-05-14 20:56 ` [PATCH 1/3] lockd: Plug nlm_file leak when nlm_do_fopen() fails Chuck Lever
@ 2026-05-14 20:56 ` Chuck Lever
2026-05-14 20:57 ` Chuck Lever
2026-05-14 20:56 ` [PATCH 2/3] lockd: Plug nlm_file refcount leak on cached nlm_do_fopen() failure Chuck Lever
2026-05-14 20:56 ` [PATCH 3/3] lockd: Avoid hashing uninitialized bytes in nlm4svc_lookup_file() Chuck Lever
3 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2026-05-14 20:56 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever, Jakub Kicinski
From: Chuck Lever <chuck.lever@oracle.com>
The sk_err check in tls_rx_rec_wait() consumes the error via
sock_error(), which clears sk_err atomically. When the caller
(tls_sw_recvmsg, tls_sw_splice_read, or tls_sw_read_sock) already
has bytes copied to userspace, it returns those bytes and discards
the error from this call. sk_err is now zero on the socket, so the
next read syscall observes only RCV_SHUTDOWN and reports a clean
EOF instead of the actual error (typically -ECONNRESET).
The race is reachable when tls_read_flush_backlog()'s periodic
sk_flush_backlog() triggers tcp_reset() in the middle of a
multi-record read.
Pass a has_copied flag to tls_rx_rec_wait(). When has_copied is
false, consume sk_err via sock_error() as before. When has_copied
is true, report the error from READ_ONCE() but leave sk_err set:
the caller returns the byte count and discards the err from this
call, and the next read syscall surfaces the preserved sk_err. This
mirrors the tcp_recvmsg() preserve-and-surface pattern.
The decrypt-abort path is unaffected: tls_err_abort() raises
sk_err to EBADMSG after tls_rx_rec_wait() returns, and nothing
on the caller's return path consumes it, so the EBADMSG surfaces
on the next read.
tls_sw_splice_read() passes has_copied=false: it processes
one record per call, so no bytes have been copied within the
function when tls_rx_rec_wait() runs. A reset that arrives
between iterations of splice_direct_to_actor() (the sendfile()
path) is still consumed by sock_error() in the later call, and the
outer loop returns the prior iterations' byte count and drops the
error. tcp_splice_read() exhibits the same pattern at the iteration
boundary; addressing it belongs at the splice_direct_to_actor()
layer and is out of scope here.
Fixes: c46b01839f7a ("tls: rx: periodically flush socket backlog")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/tls/tls_sw.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 2590e855f6a5..c4cc4e357848 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1356,9 +1356,14 @@ void tls_sw_splice_eof(struct socket *sock)
mutex_unlock(&tls_ctx->tx_lock);
}
+/* When has_copied is true the caller has already moved bytes to
+ * userspace. Report sk_err but leave it set so the next read
+ * surfaces it instead of a spurious EOF, otherwise sk_err is
+ * consumed via sock_error().
+ */
static int
tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
- bool released)
+ bool released, bool has_copied)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
@@ -1376,8 +1381,11 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
if (!sk_psock_queue_empty(psock))
return 0;
- if (sk->sk_err)
+ if (sk->sk_err) {
+ if (has_copied)
+ return -READ_ONCE(sk->sk_err);
return sock_error(sk);
+ }
if (ret < 0)
return ret;
@@ -1413,7 +1421,7 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
}
if (unlikely(!tls_strp_msg_load(&ctx->strp, released)))
- return tls_rx_rec_wait(sk, psock, nonblock, false);
+ return tls_rx_rec_wait(sk, psock, nonblock, false, has_copied);
return 1;
}
@@ -2100,7 +2108,7 @@ int tls_sw_recvmsg(struct sock *sk,
int to_decrypt, chunk;
err = tls_rx_rec_wait(sk, psock, flags & MSG_DONTWAIT,
- released);
+ released, !!(decrypted + copied));
if (err <= 0) {
if (psock) {
chunk = sk_msg_recvmsg(sk, psock, msg, len,
@@ -2287,7 +2295,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
struct tls_decrypt_arg darg;
err = tls_rx_rec_wait(sk, NULL, flags & SPLICE_F_NONBLOCK,
- true);
+ true, false);
if (err <= 0)
goto splice_read_end;
@@ -2373,7 +2381,7 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
} else {
struct tls_decrypt_arg darg;
- err = tls_rx_rec_wait(sk, NULL, true, released);
+ err = tls_rx_rec_wait(sk, NULL, true, released, !!copied);
if (err <= 0)
goto read_sock_end;
--
2.54.0
* [PATCH 2/3] lockd: Plug nlm_file refcount leak on cached nlm_do_fopen() failure
2026-05-14 20:56 [PATCH 0/3] three lockd fixes Chuck Lever
2026-05-14 20:56 ` [PATCH 1/3] lockd: Plug nlm_file leak when nlm_do_fopen() fails Chuck Lever
2026-05-14 20:56 ` [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied Chuck Lever
@ 2026-05-14 20:56 ` Chuck Lever
2026-05-14 20:56 ` [PATCH 3/3] lockd: Avoid hashing uninitialized bytes in nlm4svc_lookup_file() Chuck Lever
3 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-14 20:56 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
The cached-file path in nlm_lookup_file() reaches the found: label
unconditionally, even when nlm_do_fopen() fails. At that label
*result and file->f_count are updated before the error is returned.
The wrappers nlm3svc_lookup_file() and nlm4svc_lookup_file() then
bail out of their switch without copying *result back to their
caller, so the proc handler's local nlm_file pointer remains NULL
and the cleanup path skips nlm_release_file(). The f_count
increment is never released, and nlm_traverse_files() can no
longer reap the file because its refcount never returns to zero
between requests.
Short-circuit the cached path so neither *result nor f_count is
touched when nlm_do_fopen() fails on a hashed nlm_file.
Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/lockd/svcsubs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index 0b81d8db0919..58b87ec52930 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -150,6 +150,8 @@ nlm_lookup_file(struct svc_rqst *rqstp, struct nlm_file **result,
mutex_lock(&file->f_mutex);
nfserr = nlm_do_fopen(rqstp, file, mode);
mutex_unlock(&file->f_mutex);
+ if (nfserr)
+ goto out_unlock;
goto found;
}
nlm_debug_print_fh("creating file for", &lock->fh);
--
2.54.0
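The fixed control flow can be sketched as a userspace toy model (names and the error value are illustrative, not the lockd code): the point is that the cached path must not touch the result pointer or the refcount when the open fails.

```c
#include <assert.h>
#include <stddef.h>

struct nlm_file {
	int f_count;	/* toy refcount */
};

/* Sketch of the fixed cached-file path: bail out before *result
 * and f_count are updated, so a failed open leaves no stray
 * reference for the cleanup path to miss.
 */
static int lookup_cached(struct nlm_file *file, struct nlm_file **result,
			 int fopen_err)
{
	if (fopen_err)
		return fopen_err;	/* short-circuit: nothing touched */
	file->f_count++;
	*result = file;
	return 0;
}
```

On failure the caller's local pointer stays NULL and the refcount is unchanged, so nlm_traverse_files() in the real code can still reap the file once its count returns to zero.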
* [PATCH 3/3] lockd: Avoid hashing uninitialized bytes in nlm4svc_lookup_file()
2026-05-14 20:56 [PATCH 0/3] three lockd fixes Chuck Lever
` (2 preceding siblings ...)
2026-05-14 20:56 ` [PATCH 2/3] lockd: Plug nlm_file refcount leak on cached nlm_do_fopen() failure Chuck Lever
@ 2026-05-14 20:56 ` Chuck Lever
3 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-14 20:56 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
file_hash() digests the first LOCKD_FH_HASH_SIZE bytes of
nfs_fh.data when bucketing nlm_files[], independent of fh.size.
Commit 3de744ee4e45 ("lockd: Use xdrgen XDR functions for the
NLMv4 TEST procedure") set .pc_argzero to zero for the converted
procedures and moved file-handle population into
nlm4svc_lookup_file(), which copies only xdr_lock->fh.len bytes
into lock->fh.data.
When an NLMv4 client presents a file handle shorter than
LOCKD_FH_HASH_SIZE, bytes fh.len..31 retain whatever the argument
buffer held from an earlier request. The same wire handle then
hashes to different buckets across calls; nlm_lookup_file() misses
the existing nlm_file entry, and lock-state lookups fail.
Zero only the tail bytes that file_hash() would otherwise consume.
Handles of LOCKD_FH_HASH_SIZE or larger already populate every byte
that file_hash() reads.
Reported-by: Jeff Layton <jlayton@kernel.org>
Closes: https://lore.kernel.org/r/5229a9746d723a3f830120c0b966510f75badfc2.camel@kernel.org
Fixes: 3de744ee4e45 ("lockd: Use xdrgen XDR functions for the NLMv4 TEST procedure")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/lockd/lockd.h | 8 ++++++++
fs/lockd/svc4proc.c | 3 +++
fs/lockd/svcsubs.c | 3 +--
3 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/fs/lockd/lockd.h b/fs/lockd/lockd.h
index 0be0dac59ea2..e418a50c4180 100644
--- a/fs/lockd/lockd.h
+++ b/fs/lockd/lockd.h
@@ -52,6 +52,14 @@
*/
#define LOCKD_DFLT_TIMEO 10
+/*
+ * Number of leading bytes of nfs_fh.data that file_hash()
+ * digests when bucketing nlm_files[]. Sized for historical
+ * NFSv2 handles; nfs_fh.data must be initialized at least
+ * this far before lookup, regardless of fh.size.
+ */
+#define LOCKD_FH_HASH_SIZE 32
+
/* error codes new to NLMv4 */
#define nlm4_deadlock cpu_to_be32(NLM_DEADLCK)
#define nlm4_rofs cpu_to_be32(NLM_ROFS)
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 997f4f437997..78e675470c4b 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -156,6 +156,9 @@ nlm4svc_lookup_file(struct svc_rqst *rqstp, struct nlm_host *host,
return nlm_lck_denied_nolocks;
lock->fh.size = xdr_lock->fh.len;
memcpy(lock->fh.data, xdr_lock->fh.data, xdr_lock->fh.len);
+ if (xdr_lock->fh.len < LOCKD_FH_HASH_SIZE)
+ memset(lock->fh.data + xdr_lock->fh.len, 0,
+ LOCKD_FH_HASH_SIZE - xdr_lock->fh.len);
lock->oh.len = xdr_lock->oh.len;
lock->oh.data = xdr_lock->oh.data;
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index 58b87ec52930..a0d1a6fbf61e 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -17,7 +17,6 @@
#include <linux/sunrpc/addr.h>
#include <linux/module.h>
#include <linux/mount.h>
-#include <uapi/linux/nfs2.h>
#include "lockd.h"
#include "share.h"
@@ -67,7 +66,7 @@ static inline unsigned int file_hash(struct nfs_fh *f)
{
unsigned int tmp=0;
int i;
- for (i=0; i<NFS2_FHSIZE;i++)
+ for (i = 0; i < LOCKD_FH_HASH_SIZE; i++)
tmp += f->data[i];
return tmp & (FILE_NRHASH - 1);
}
--
2.54.0
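The hashing problem is easy to demonstrate in a userspace sketch. FILE_NRHASH and the 64-byte data array below are illustrative stand-ins; file_hash() mirrors the patched loop, which sums a fixed 32-byte prefix regardless of the actual handle length, so a short handle must have its tail zeroed to hash stably across requests:

```c
#include <assert.h>
#include <string.h>

#define LOCKD_FH_HASH_SIZE 32
#define FILE_NRHASH 32		/* illustrative bucket count */

struct nfs_fh {
	unsigned char data[64];
};

/* Mirror of file_hash(): digests the first 32 bytes of the handle
 * buffer, independent of how many bytes the client actually sent.
 */
static unsigned int file_hash(const struct nfs_fh *f)
{
	unsigned int tmp = 0;

	for (int i = 0; i < LOCKD_FH_HASH_SIZE; i++)
		tmp += f->data[i];
	return tmp & (FILE_NRHASH - 1);
}

/* Copy a short wire handle and zero only the tail that
 * file_hash() would otherwise read uninitialized.
 */
static void set_fh(struct nfs_fh *fh, const unsigned char *wire, size_t len)
{
	memcpy(fh->data, wire, len);
	if (len < LOCKD_FH_HASH_SIZE)
		memset(fh->data + len, 0, LOCKD_FH_HASH_SIZE - len);
}
```

Without the memset, two buffers holding leftover bytes from earlier requests would send the same wire handle to different buckets, which is exactly the lookup miss the patch fixes.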
* Re: [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied
2026-05-14 20:56 ` [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied Chuck Lever
@ 2026-05-14 20:57 ` Chuck Lever
0 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-14 20:57 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
Tom Talpey
Cc: linux-nfs, Jakub Kicinski
On 5/14/26 4:56 PM, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> The sk_err check in tls_rx_rec_wait() consumes the error via
> sock_error(), which clears sk_err atomically. When the caller
> (tls_sw_recvmsg, tls_sw_splice_read, or tls_sw_read_sock) already
> has bytes copied to userspace, it returns those bytes and discards
> the error from this call. sk_err is now zero on the socket, so the
> next read syscall observes only RCV_SHUTDOWN and reports a clean
> EOF instead of the actual error (typically -ECONNRESET).
>
> The race is reachable when tls_read_flush_backlog()'s periodic
> sk_flush_backlog() triggers tcp_reset() in the middle of a
> multi-record read.
>
> Pass a has_copied flag to tls_rx_rec_wait(). When has_copied is
> false, consume sk_err via sock_error() as before. When has_copied
> is true, report the error from READ_ONCE() but leave sk_err set:
> the caller returns the byte count and discards the err from this
> call, and the next read syscall surfaces the preserved sk_err. This
> mirrors the tcp_recvmsg() preserve-and-surface pattern.
>
> The decrypt-abort path is unaffected: tls_err_abort() raises
> sk_err to EBADMSG after tls_rx_rec_wait() returns, and nothing
> on the caller's return path consumes it, so the EBADMSG surfaces
> on the next read.
>
> tls_sw_splice_read() passes has_copied=false: it processes
> one record per call, so no bytes have been copied within the
> function when tls_rx_rec_wait() runs. A reset that arrives
> between iterations of splice_direct_to_actor() (the sendfile()
> path) is still consumed by sock_error() in the later call, and the
> outer loop returns the prior iterations' byte count and drops the
> error. tcp_splice_read() exhibits the same pattern at the iteration
> boundary; addressing it belongs at the splice_direct_to_actor()
> layer and is out of scope here.
>
> Fixes: c46b01839f7a ("tls: rx: periodically flush socket backlog")
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> net/tls/tls_sw.c | 20 ++++++++++++++------
> 1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 2590e855f6a5..c4cc4e357848 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1356,9 +1356,14 @@ void tls_sw_splice_eof(struct socket *sock)
> mutex_unlock(&tls_ctx->tx_lock);
> }
>
> +/* When has_copied is true the caller has already moved bytes to
> + * userspace. Report sk_err but leave it set so the next read
> + * surfaces it instead of a spurious EOF, otherwise sk_err is
> + * consumed via sock_error().
> + */
> static int
> tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
> - bool released)
> + bool released, bool has_copied)
> {
> struct tls_context *tls_ctx = tls_get_ctx(sk);
> struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
> @@ -1376,8 +1381,11 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
> if (!sk_psock_queue_empty(psock))
> return 0;
>
> - if (sk->sk_err)
> + if (sk->sk_err) {
> + if (has_copied)
> + return -READ_ONCE(sk->sk_err);
> return sock_error(sk);
> + }
>
> if (ret < 0)
> return ret;
> @@ -1413,7 +1421,7 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock,
> }
>
> if (unlikely(!tls_strp_msg_load(&ctx->strp, released)))
> - return tls_rx_rec_wait(sk, psock, nonblock, false);
> + return tls_rx_rec_wait(sk, psock, nonblock, false, has_copied);
>
> return 1;
> }
> @@ -2100,7 +2108,7 @@ int tls_sw_recvmsg(struct sock *sk,
> int to_decrypt, chunk;
>
> err = tls_rx_rec_wait(sk, psock, flags & MSG_DONTWAIT,
> - released);
> + released, !!(decrypted + copied));
> if (err <= 0) {
> if (psock) {
> chunk = sk_msg_recvmsg(sk, psock, msg, len,
> @@ -2287,7 +2295,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
> struct tls_decrypt_arg darg;
>
> err = tls_rx_rec_wait(sk, NULL, flags & SPLICE_F_NONBLOCK,
> - true);
> + true, false);
> if (err <= 0)
> goto splice_read_end;
>
> @@ -2373,7 +2381,7 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
> } else {
> struct tls_decrypt_arg darg;
>
> - err = tls_rx_rec_wait(sk, NULL, true, released);
> + err = tls_rx_rec_wait(sk, NULL, true, released, !!copied);
> if (err <= 0)
> goto read_sock_end;
>
Ignore this email. It looks like something left over from a previous
posting.
--
Chuck Lever
* Re: [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied
2026-05-13 12:58 [PATCH net] tls: Preserve sk_err across recvmsg() when data has been copied Chuck Lever
@ 2026-05-15 1:30 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-05-15 1:30 UTC (permalink / raw)
To: Chuck Lever
Cc: john.fastabend, kuba, sd, davem, edumazet, pabeni, horms, netdev,
chuck.lever
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 13 May 2026 08:58:25 -0400 you wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> The sk_err check in tls_rx_rec_wait() consumes the error via
> sock_error(), which clears sk_err atomically. When the caller
> (tls_sw_recvmsg, tls_sw_splice_read, or tls_sw_read_sock) already
> has bytes copied to userspace, it returns those bytes and discards
> the error from this call. sk_err is now zero on the socket, so the
> next read syscall observes only RCV_SHUTDOWN and reports a clean
> EOF instead of the actual error (typically -ECONNRESET).
>
> [...]
Here is the summary with links:
- [net] tls: Preserve sk_err across recvmsg() when data has been copied
https://git.kernel.org/netdev/net/c/f508262ae9f2
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html