* [PATCH] net/handshake: a handshake can only be cancelled once
From: Scott Mayhew @ 2025-12-06 14:30 UTC (permalink / raw)
To: chuck.lever; +Cc: kernel-tls-handshake, netdev
When a handshake request is cancelled it is removed from the
handshake_net->hn_requests list, but it is still present in the
handshake_rhashtbl until it is destroyed.
If a second cancellation request arrives for the same handshake request,
then remove_pending() will return false... and assuming
HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
processing through the out_true label, where we put another reference on
the sock and a refcount underflow occurs.
This can happen for example if a handshake times out - particularly if
the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
follow it up with the ClientHello due to a problem with tlshd. When the
timeout is hit on the server, the server will send a FIN, which triggers
a cancellation request via xs_reset_transport(). When the timeout is
hit on the client, another cancellation request happens via
xs_tls_handshake_sync().
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
net/handshake/request.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/handshake/request.c b/net/handshake/request.c
index 274d2c89b6b2..c7b20d167a55 100644
--- a/net/handshake/request.c
+++ b/net/handshake/request.c
@@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
 		return false;
 	}
 
+	/* Duplicate cancellation request */
+	trace_handshake_cancel_none(net, req, sk);
+	return false;
+
 out_true:
 	trace_handshake_cancel(net, req, sk);
 
--
2.51.0
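The double-cancel sequence described in the commit message can be modelled in userspace. The following is only a sketch: struct model_req and every model_* name are hypothetical stand-ins, the control flow of the unpatched handshake_req_cancel() is reconstructed from the description in this thread, and locking and concurrency are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a handshake request and its sock. */
struct model_req {
	bool pending;	/* models membership in hn_requests */
	bool completed;	/* models HANDSHAKE_F_REQ_COMPLETED in hr_flags */
	int sk_refs;	/* models the sock reference taken at submit time */
};

/* Models test_and_set_bit(): set the flag, return its previous value. */
static bool model_test_and_set(bool *flag)
{
	bool old = *flag;

	*flag = true;
	return old;
}

/* Models remove_pending(): true only if the request was still queued. */
static bool model_remove_pending(struct model_req *req)
{
	bool was_pending = req->pending;

	req->pending = false;
	return was_pending;
}

/* Models the unpatched cancel path as described in the commit message. */
static bool model_cancel_unpatched(struct model_req *req)
{
	if (model_remove_pending(req))
		goto out_true;		/* request hadn't been accepted */
	if (model_test_and_set(&req->completed))
		return false;		/* flag was already set */
out_true:
	req->sk_refs--;			/* models sock_put() */
	return true;
}

/* Submit once, cancel twice; returns the resulting reference count. */
int model_double_cancel_unpatched(void)
{
	struct model_req req = {
		.pending = true, .completed = false, .sk_refs = 1,
	};

	model_cancel_unpatched(&req);	/* first cancellation request */
	model_cancel_unpatched(&req);	/* second cancellation request */
	return req.sk_refs;
}
```

In this model the first cancel drops the only reference, and the second cancel finds the flag clear, falls through to out_true, and drops a reference that was never taken, which corresponds to the underflow reported above.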
* Re: [PATCH] net/handshake: a handshake can only be cancelled once
From: Chuck Lever @ 2025-12-06 15:12 UTC (permalink / raw)
To: Scott Mayhew, Chuck Lever; +Cc: kernel-tls-handshake, netdev
On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote:
> When a handshake request is cancelled it is removed from the
> handshake_net->hn_requests list, but it is still present in the
> handshake_rhashtbl until it is destroyed.
>
> If a second cancellation request arrives for the same handshake request,
> then remove_pending() will return false... and assuming
> HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
> processing through the out_true label, where we put another reference on
> the sock and a refcount underflow occurs.
>
> This can happen for example if a handshake times out - particularly if
> the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
> follow it up with the ClientHello due to a problem with tlshd. When the
> timeout is hit on the server, the server will send a FIN, which triggers
> a cancellation request via xs_reset_transport(). When the timeout is
> hit on the client, another cancellation request happens via
> xs_tls_handshake_sync().
>
> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for
> handling handshake requests")
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
> net/handshake/request.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/handshake/request.c b/net/handshake/request.c
> index 274d2c89b6b2..c7b20d167a55 100644
> --- a/net/handshake/request.c
> +++ b/net/handshake/request.c
> @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
> return false;
> }
>
> + /* Duplicate cancellation request */
> + trace_handshake_cancel_none(net, req, sk);
> + return false;
> +
> out_true:
> trace_handshake_cancel(net, req, sk);
>
> --
> 2.51.0
To help support engineers find this patch, I recommend using
"net/handshake: duplicate handshake cancellations leak socket" as
the short description.
The proposed solution might introduce a socket reference leak:
1. Request submitted: sock_hold() called (line 271)
2. Request accepted by daemon via handshake_req_next()
   (removes from pending list)
3. Cancel called:
   - remove_pending() returns FALSE (not in pending list)
   - test_and_set_bit() returns FALSE (sets the bit now)
   - With patch: returns FALSE, sock_put() NOT called
4. handshake_complete() called: bit already set, skips sock_put()
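The four steps above can be traced with a small userspace model. This is a sketch under stated assumptions: struct model_req and the model_* functions are hypothetical, the cancel path mirrors the posted patch only as far as this thread describes it, and the completion path follows step 4 (the reference is dropped only when the flag was not already set).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a handshake request and its sock. */
struct model_req {
	bool pending;	/* models membership in hn_requests */
	bool completed;	/* models HANDSHAKE_F_REQ_COMPLETED in hr_flags */
	int sk_refs;	/* models the sock reference taken at submit time */
};

/* Models test_and_set_bit(): set the flag, return its previous value. */
static bool model_test_and_set(bool *flag)
{
	bool old = *flag;

	*flag = true;
	return old;
}

/* Models the cancel path with the four lines of the posted patch added. */
static bool model_cancel_posted_patch(struct model_req *req)
{
	bool was_pending = req->pending;

	req->pending = false;		/* models remove_pending() */
	if (was_pending)
		goto out_true;		/* request hadn't been accepted */
	if (model_test_and_set(&req->completed))
		return false;		/* flag was already set */
	return false;			/* added by the patch: "duplicate" */
out_true:
	req->sk_refs--;			/* models sock_put() */
	return true;
}

/* Models step 4: drop the reference only if the flag was still clear. */
static void model_complete(struct model_req *req)
{
	if (!model_test_and_set(&req->completed))
		req->sk_refs--;		/* models sock_put() */
}

/* Runs steps 1-4; returns the reference count left at the end. */
int model_accept_cancel_complete(void)
{
	struct model_req req = {
		.pending = true, .completed = false, .sk_refs = 1,	/* step 1 */
	};
	bool cancelled;

	req.pending = false;		/* step 2: models handshake_req_next() */
	cancelled = model_cancel_posted_patch(&req);	/* step 3 */
	assert(!cancelled);
	model_complete(&req);		/* step 4 */
	return req.sk_refs;
}
```

In this model the reference taken at submit time is still held after completion, because the cancel sets the flag without calling the modelled sock_put() and the completion then skips its own put.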
What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the
pending cancel path so duplicate cancels can be detected?
Instead of:
	if (hn && remove_pending(hn, req)) {
		/* Request hadn't been accepted */
		goto out_true;
	}
go with this bit of untested code:
	if (hn && remove_pending(hn, req)) {
		/* Request hadn't been accepted - mark cancelled */
		if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
			trace_handshake_cancel_busy(net, req, sk);
			return false;
		}
		goto out_true;
	}
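As a rough sanity check of the snippet above, the same idea can be exercised in a userspace model. Everything here is an assumption made for illustration: struct model_req and the model_* names are hypothetical, the rest of the cancel path is reconstructed from this thread, and the model is single-threaded, so it says nothing about races.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a handshake request and its sock. */
struct model_req {
	bool pending;	/* models membership in hn_requests */
	bool completed;	/* models HANDSHAKE_F_REQ_COMPLETED in hr_flags */
	int sk_refs;	/* models the sock reference taken at submit time */
};

/* Models test_and_set_bit(): set the flag, return its previous value. */
static bool model_test_and_set(bool *flag)
{
	bool old = *flag;

	*flag = true;
	return old;
}

/* Models the cancel path with the flag also set in the pending branch. */
static bool model_cancel_proposed(struct model_req *req)
{
	bool was_pending = req->pending;

	req->pending = false;		/* models remove_pending() */
	if (was_pending) {
		/* Request hadn't been accepted - mark cancelled */
		if (model_test_and_set(&req->completed))
			return false;
		goto out_true;
	}
	if (model_test_and_set(&req->completed))
		return false;		/* flag already set: nothing to put */
out_true:
	req->sk_refs--;			/* models sock_put() */
	return true;
}

/* Models completion: drop the reference only if the flag was clear. */
static void model_complete(struct model_req *req)
{
	if (!model_test_and_set(&req->completed))
		req->sk_refs--;		/* models sock_put() */
}

/* Two cancels of a still-pending request; returns the final count. */
int model_double_cancel(void)
{
	struct model_req req = {
		.pending = true, .completed = false, .sk_refs = 1,
	};

	model_cancel_proposed(&req);
	model_cancel_proposed(&req);	/* sees the flag, returns false */
	return req.sk_refs;
}

/* Accept, then cancel, then complete; returns the final count. */
int model_accept_cancel_complete(void)
{
	struct model_req req = {
		.pending = true, .completed = false, .sk_refs = 1,
	};

	req.pending = false;		/* models handshake_req_next() */
	model_cancel_proposed(&req);	/* sets the flag and puts the ref */
	model_complete(&req);		/* flag already set, skips the put */
	return req.sk_refs;
}
```

In this model both sequences end with exactly one put for the one hold: the duplicate cancel is rejected because the first cancel set the flag, and the accepted-then-cancelled request is released by the cancel rather than by the completion.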
--
Chuck Lever
* Re: [PATCH] net/handshake: a handshake can only be cancelled once
From: Scott Mayhew @ 2025-12-09 19:27 UTC (permalink / raw)
To: Chuck Lever; +Cc: Chuck Lever, kernel-tls-handshake, netdev
On Sat, 06 Dec 2025, Chuck Lever wrote:
>
>
> On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote:
> > When a handshake request is cancelled it is removed from the
> > handshake_net->hn_requests list, but it is still present in the
> > handshake_rhashtbl until it is destroyed.
> >
> > If a second cancellation request arrives for the same handshake request,
> > then remove_pending() will return false... and assuming
> > HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
> > processing through the out_true label, where we put another reference on
> > the sock and a refcount underflow occurs.
> >
> > This can happen for example if a handshake times out - particularly if
> > the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
> > follow it up with the ClientHello due to a problem with tlshd. When the
> > timeout is hit on the server, the server will send a FIN, which triggers
> > a cancellation request via xs_reset_transport(). When the timeout is
> > hit on the client, another cancellation request happens via
> > xs_tls_handshake_sync().
> >
> > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for
> > handling handshake requests")
> > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > ---
> > net/handshake/request.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/net/handshake/request.c b/net/handshake/request.c
> > index 274d2c89b6b2..c7b20d167a55 100644
> > --- a/net/handshake/request.c
> > +++ b/net/handshake/request.c
> > @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
> > return false;
> > }
> >
> > + /* Duplicate cancellation request */
> > + trace_handshake_cancel_none(net, req, sk);
> > + return false;
> > +
> > out_true:
> > trace_handshake_cancel(net, req, sk);
> >
> > --
> > 2.51.0
>
> To help support engineers find this patch, I recommend using
> "net/handshake: duplicate handshake cancellations leak socket" as
> the short description.
>
> The proposed solution might introduce a socket reference leak:
>
> 1. Request submitted: sock_hold() called (line 271)
> 2. Request accepted by daemon via handshake_req_next()
> (removes from pending list)
> 3. Cancel called:
> - remove_pending() returns FALSE (not in pending list)
> - test_and_set_bit() returns FALSE (sets the bit now)
> - With patch: returns FALSE, sock_put() NOT called
> 4. handshake_complete() called: bit already set, skips sock_put()
>
> What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the
> pending cancel path so duplicate cancels can be detected?
>
> Instead of:
>
> if (hn && remove_pending(hn, req)) {
> /* Request hadn't been accepted */
> goto out_true;
> }
>
> go with this bit of untested code:
>
> if (hn && remove_pending(hn, req)) {
> /* Request hadn't been accepted - mark cancelled */
> if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
> trace_handshake_cancel_busy(net, req, sk);
> return false;
> }
> goto out_true;
> }
Thanks, Chuck. That works.
>
> --
> Chuck Lever
>