* [PATCH] net/handshake: a handshake can only be cancelled once
@ 2025-12-06 14:30 Scott Mayhew
2025-12-06 15:12 ` Chuck Lever
0 siblings, 1 reply; 3+ messages in thread
From: Scott Mayhew @ 2025-12-06 14:30 UTC (permalink / raw)
To: chuck.lever; +Cc: kernel-tls-handshake, netdev
When a handshake request is cancelled it is removed from the
handshake_net->hn_requests list, but it is still present in the
handshake_rhashtbl until it is destroyed.
If a second cancellation request arrives for the same handshake request,
then remove_pending() will return false... and assuming
HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
processing through the out_true label, where we put another reference on
the sock and a refcount underflow occurs.
This can happen for example if a handshake times out - particularly if
the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
follow it up with the ClientHello due to a problem with tlshd. When the
timeout is hit on the server, the server will send a FIN, which triggers
a cancellation request via xs_reset_transport(). When the timeout is
hit on the client, another cancellation request happens via
xs_tls_handshake_sync().
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
net/handshake/request.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/handshake/request.c b/net/handshake/request.c
index 274d2c89b6b2..c7b20d167a55 100644
--- a/net/handshake/request.c
+++ b/net/handshake/request.c
@@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
return false;
}
+ /* Duplicate cancellation request */
+ trace_handshake_cancel_none(net, req, sk);
+ return false;
+
out_true:
trace_handshake_cancel(net, req, sk);
--
2.51.0
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] net/handshake: a handshake can only be cancelled once 2025-12-06 14:30 [PATCH] net/handshake: a handshake can only be cancelled once Scott Mayhew @ 2025-12-06 15:12 ` Chuck Lever 2025-12-09 19:27 ` Scott Mayhew 0 siblings, 1 reply; 3+ messages in thread From: Chuck Lever @ 2025-12-06 15:12 UTC (permalink / raw) To: Scott Mayhew, Chuck Lever; +Cc: kernel-tls-handshake, netdev On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote: > When a handshake request is cancelled it is removed from the > handshake_net->hn_requests list, but it is still present in the > handshake_rhashtbl until it is destroyed. > > If a second cancellation request arrives for the same handshake request, > then remove_pending() will return false... and assuming > HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue > processing through the out_true label, where we put another reference on > the sock and a refcount underflow occurs. > > This can happen for example if a handshake times out - particularly if > the SUNRPC client sends the AUTH_TLS probe to the server but doesn't > follow it up with the ClientHello due to a problem with tlshd. When the > timeout is hit on the server, the server will send a FIN, which triggers > a cancellation request via xs_reset_transport(). When the timeout is > hit on the client, another cancellation request happens via > xs_tls_handshake_sync(). > > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for > handling handshake requests") > Signed-off-by: Scott Mayhew <smayhew@redhat.com> > --- > net/handshake/request.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/net/handshake/request.c b/net/handshake/request.c > index 274d2c89b6b2..c7b20d167a55 100644 > --- a/net/handshake/request.c > +++ b/net/handshake/request.c > @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk) > return false; > } > > + /* Duplicate cancellation request */ > + trace_handshake_cancel_none(net, req, sk); > + return false; > + > out_true: > trace_handshake_cancel(net, req, sk); > > -- > 2.51.0 To help support engineers find this patch, I recommend using "net/handshake: duplicate handshake cancellations leak socket" as the short description. The proposed solution might introduce a socket reference leak: 1. Request submitted: sock_hold() called (line 271) 2. Request accepted by daemon via handshake_req_next() (removes from pending list) 3. Cancel called: - remove_pending() returns FALSE (not in pending list) - test_and_set_bit() returns FALSE (sets the bit now) - With patch: returns FALSE, sock_put() NOT called 4. handshake_complete() called: bit already set, skips sock_put() What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the pending cancel path so duplicate cancels can be detected? Instead of: if (hn && remove_pending(hn, req)) { /* Request hadn't been accepted */ goto out_true; } go with this bit of untested code: if (hn && remove_pending(hn, req)) { /* Request hadn't been accepted - mark cancelled */ if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) { trace_handshake_cancel_busy(net, req, sk); return false; } goto out_true; } -- Chuck Lever ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] net/handshake: a handshake can only be cancelled once 2025-12-06 15:12 ` Chuck Lever @ 2025-12-09 19:27 ` Scott Mayhew 0 siblings, 0 replies; 3+ messages in thread From: Scott Mayhew @ 2025-12-09 19:27 UTC (permalink / raw) To: Chuck Lever; +Cc: Chuck Lever, kernel-tls-handshake, netdev On Sat, 06 Dec 2025, Chuck Lever wrote: > > > On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote: > > When a handshake request is cancelled it is removed from the > > handshake_net->hn_requests list, but it is still present in the > > handshake_rhashtbl until it is destroyed. > > > > If a second cancellation request arrives for the same handshake request, > > then remove_pending() will return false... and assuming > > HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue > > processing through the out_true label, where we put another reference on > > the sock and a refcount underflow occurs. > > > > This can happen for example if a handshake times out - particularly if > > the SUNRPC client sends the AUTH_TLS probe to the server but doesn't > > follow it up with the ClientHello due to a problem with tlshd. When the > > timeout is hit on the server, the server will send a FIN, which triggers > > a cancellation request via xs_reset_transport(). When the timeout is > > hit on the client, another cancellation request happens via > > xs_tls_handshake_sync(). > > > > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for > > handling handshake requests") > > Signed-off-by: Scott Mayhew <smayhew@redhat.com> > > --- > > net/handshake/request.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/net/handshake/request.c b/net/handshake/request.c > > index 274d2c89b6b2..c7b20d167a55 100644 > > --- a/net/handshake/request.c > > +++ b/net/handshake/request.c > > @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk) > > return false; > > } > > > > + /* Duplicate cancellation request */ > > + trace_handshake_cancel_none(net, req, sk); > > + return false; > > + > > out_true: > > trace_handshake_cancel(net, req, sk); > > > > -- > > 2.51.0 > > To help support engineers find this patch, I recommend using > "net/handshake: duplicate handshake cancellations leak socket" as > the short description. > > The proposed solution might introduce a socket reference leak: > > 1. Request submitted: sock_hold() called (line 271) > 2. Request accepted by daemon via handshake_req_next() > (removes from pending list) > 3. Cancel called: > - remove_pending() returns FALSE (not in pending list) > - test_and_set_bit() returns FALSE (sets the bit now) > - With patch: returns FALSE, sock_put() NOT called > 4. handshake_complete() called: bit already set, skips sock_put() > > What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the > pending cancel path so duplicate cancels can be detected? > > Instead of: > > if (hn && remove_pending(hn, req)) { > /* Request hadn't been accepted */ > goto out_true; > } > > go with this bit of untested code: > > if (hn && remove_pending(hn, req)) { > /* Request hadn't been accepted - mark cancelled */ > if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) { > trace_handshake_cancel_busy(net, req, sk); > return false; > } > goto out_true; > } Thanks, Chuck. That works. > > -- > Chuck Lever > ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-12-09 19:27 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-12-06 14:30 [PATCH] net/handshake: a handshake can only be cancelled once Scott Mayhew 2025-12-06 15:12 ` Chuck Lever 2025-12-09 19:27 ` Scott Mayhew
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).