netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net/handshake: a handshake can only be cancelled once
@ 2025-12-06 14:30 Scott Mayhew
  2025-12-06 15:12 ` Chuck Lever
  0 siblings, 1 reply; 3+ messages in thread
From: Scott Mayhew @ 2025-12-06 14:30 UTC (permalink / raw)
  To: chuck.lever; +Cc: kernel-tls-handshake, netdev

When a handshake request is cancelled it is removed from the
handshake_net->hn_requests list, but it is still present in the
handshake_rhashtbl until it is destroyed.

If a second cancellation request arrives for the same handshake request,
then remove_pending() will return false... and assuming
HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
processing through the out_true label, where we put another reference on
the sock and a refcount underflow occurs.

This can happen for example if a handshake times out - particularly if
the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
follow it up with the ClientHello due to a problem with tlshd.  When the
timeout is hit on the server, the server will send a FIN, which triggers
a cancellation request via xs_reset_transport().  When the timeout is
hit on the client, another cancellation request happens via
xs_tls_handshake_sync().

Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 net/handshake/request.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/handshake/request.c b/net/handshake/request.c
index 274d2c89b6b2..c7b20d167a55 100644
--- a/net/handshake/request.c
+++ b/net/handshake/request.c
@@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
 		return false;
 	}
 
+	/* Duplicate cancellation request */
+	trace_handshake_cancel_none(net, req, sk);
+	return false;
+
 out_true:
 	trace_handshake_cancel(net, req, sk);
 
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] net/handshake: a handshake can only be cancelled once
  2025-12-06 14:30 [PATCH] net/handshake: a handshake can only be cancelled once Scott Mayhew
@ 2025-12-06 15:12 ` Chuck Lever
  2025-12-09 19:27   ` Scott Mayhew
  0 siblings, 1 reply; 3+ messages in thread
From: Chuck Lever @ 2025-12-06 15:12 UTC (permalink / raw)
  To: Scott Mayhew, Chuck Lever; +Cc: kernel-tls-handshake, netdev



On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote:
> When a handshake request is cancelled it is removed from the
> handshake_net->hn_requests list, but it is still present in the
> handshake_rhashtbl until it is destroyed.
>
> If a second cancellation request arrives for the same handshake request,
> then remove_pending() will return false... and assuming
> HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
> processing through the out_true label, where we put another reference on
> the sock and a refcount underflow occurs.
>
> This can happen for example if a handshake times out - particularly if
> the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
> follow it up with the ClientHello due to a problem with tlshd.  When the
> timeout is hit on the server, the server will send a FIN, which triggers
> a cancellation request via xs_reset_transport().  When the timeout is
> hit on the client, another cancellation request happens via
> xs_tls_handshake_sync().
>
> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for 
> handling handshake requests")
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
>  net/handshake/request.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/handshake/request.c b/net/handshake/request.c
> index 274d2c89b6b2..c7b20d167a55 100644
> --- a/net/handshake/request.c
> +++ b/net/handshake/request.c
> @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
>  		return false;
>  	}
> 
> +	/* Duplicate cancellation request */
> +	trace_handshake_cancel_none(net, req, sk);
> +	return false;
> +
>  out_true:
>  	trace_handshake_cancel(net, req, sk);
> 
> -- 
> 2.51.0

To help support engineers find this patch, I recommend using
"net/handshake: duplicate handshake cancellations leak socket" as
the short description.

The proposed solution might introduce a socket reference leak:

1. Request submitted: sock_hold() called (line 271)
2. Request accepted by daemon via handshake_req_next()
   (removes from pending list)
3. Cancel called:
  - remove_pending() returns FALSE (not in pending list)
  - test_and_set_bit() returns FALSE (sets the bit now)
  - With patch: returns FALSE, sock_put() NOT called
4. handshake_complete() called: bit already set, skips sock_put()

What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the
pending cancel path so duplicate cancels can be detected?

Instead of:

        if (hn && remove_pending(hn, req)) {
                /* Request hadn't been accepted */
                goto out_true;
        }

go with this bit of untested code:

        if (hn && remove_pending(hn, req)) {
                /* Request hadn't been accepted - mark cancelled */
                if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
                        trace_handshake_cancel_busy(net, req, sk);
                        return false;
                }
                goto out_true;
        }

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] net/handshake: a handshake can only be cancelled once
  2025-12-06 15:12 ` Chuck Lever
@ 2025-12-09 19:27   ` Scott Mayhew
  0 siblings, 0 replies; 3+ messages in thread
From: Scott Mayhew @ 2025-12-09 19:27 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Chuck Lever, kernel-tls-handshake, netdev

On Sat, 06 Dec 2025, Chuck Lever wrote:

> 
> 
> On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote:
> > When a handshake request is cancelled it is removed from the
> > handshake_net->hn_requests list, but it is still present in the
> > handshake_rhashtbl until it is destroyed.
> >
> > If a second cancellation request arrives for the same handshake request,
> > then remove_pending() will return false... and assuming
> > HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
> > processing through the out_true label, where we put another reference on
> > the sock and a refcount underflow occurs.
> >
> > This can happen for example if a handshake times out - particularly if
> > the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
> > follow it up with the ClientHello due to a problem with tlshd.  When the
> > timeout is hit on the server, the server will send a FIN, which triggers
> > a cancellation request via xs_reset_transport().  When the timeout is
> > hit on the client, another cancellation request happens via
> > xs_tls_handshake_sync().
> >
> > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for 
> > handling handshake requests")
> > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > ---
> >  net/handshake/request.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/net/handshake/request.c b/net/handshake/request.c
> > index 274d2c89b6b2..c7b20d167a55 100644
> > --- a/net/handshake/request.c
> > +++ b/net/handshake/request.c
> > @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
> >  		return false;
> >  	}
> > 
> > +	/* Duplicate cancellation request */
> > +	trace_handshake_cancel_none(net, req, sk);
> > +	return false;
> > +
> >  out_true:
> >  	trace_handshake_cancel(net, req, sk);
> > 
> > -- 
> > 2.51.0
> 
> To help support engineers find this patch, I recommend using
> "net/handshake: duplicate handshake cancellations leak socket" as
> the short description.
> 
> The proposed solution might introduce a socket reference leak:
> 
> 1. Request submitted: sock_hold() called (line 271)
> 2. Request accepted by daemon via handshake_req_next()
>    (removes from pending list)
> 3. Cancel called:
>   - remove_pending() returns FALSE (not in pending list)
>   - test_and_set_bit() returns FALSE (sets the bit now)
>   - With patch: returns FALSE, sock_put() NOT called
> 4. handshake_complete() called: bit already set, skips sock_put()
> 
> What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the
> pending cancel path so duplicate cancels can be detected?
> 
> Instead of:
> 
>         if (hn && remove_pending(hn, req)) {
>                 /* Request hadn't been accepted */
>                 goto out_true;
>         }
> 
> go with this bit of untested code:
> 
>         if (hn && remove_pending(hn, req)) {
>                 /* Request hadn't been accepted - mark cancelled */
>                 if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
>                         trace_handshake_cancel_busy(net, req, sk);
>                         return false;
>                 }
>                 goto out_true;
>         }

Thanks, Chuck.  That works.
> 
> -- 
> Chuck Lever
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-12-09 19:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-06 14:30 [PATCH] net/handshake: a handshake can only be cancelled once Scott Mayhew
2025-12-06 15:12 ` Chuck Lever
2025-12-09 19:27   ` Scott Mayhew

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).