Linux NFS development
* race between handshake and socket close
@ 2023-08-28 18:02 Chuck Lever III
  2023-08-28 18:35 ` Trond Myklebust
  0 siblings, 1 reply; 3+ messages in thread
From: Chuck Lever III @ 2023-08-28 18:02 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna Schumaker, Linux NFS Mailing List

Hi -

Anna has identified a window during a handshake where a socket
close will trigger a crash. This could be a common scenario for
a rejected handshake.

          <idle>-0     [003]  5405.466661: rpc_socket_state_change: socket:[59266] srcaddr=192.168.122.166:833 dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=8 (CLOSE_WAIT)
          <idle>-0     [003]  5405.466665: xs_data_ready:        peer=[192.168.122.100]:2049
          <idle>-0     [003]  5405.466668: rpc_socket_state_change: socket:[59266] srcaddr=192.168.122.166:833 dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=7 (CLOSE)
          <idle>-0     [003]  5405.466669: rpc_socket_error:     error=-32 socket:[59266] srcaddr=192.168.122.166:833 dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=7 (CLOSE)
    kworker/u8:2-2367  [001]  5405.466786: xprt_disconnect_force: peer=[192.168.122.100]:2049 state=BOUND
           tlshd-6005  [002]  5405.466786: handshake_cmd_done:   req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0 fd=6
           tlshd-6005  [002]  5405.466786: handshake_complete:   req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0 status=13
    kworker/u8:2-2367  [001]  5405.466789: xprt_disconnect_auto: peer=[192.168.122.100]:2049 state=LOCKED|CLOSE_WAIT|BOUND
    kworker/u8:2-2367  [001]  5405.466790: function:             xs_reset_transport
    kworker/u8:2-2367  [001]  5405.466794: kernel_stack:         <stack trace >
=> ftrace_trampoline (ffffffffc0c3409b)
=> xs_reset_transport (ffffffffc0814225)
=> xs_tcp_shutdown (ffffffffc0816b3e)
=> xprt_autoclose (ffffffffc0811799)
=> process_one_work (ffffffffa6ae2777)
=> worker_thread (ffffffffa6ae2d67)
=> kthread (ffffffffa6ae93e7)
=> ret_from_fork (ffffffffa6a424f7)
=> ret_from_fork_asm (ffffffffa6a03aeb)
    kworker/u8:2-2367  [001]  5405.466795: handshake_cancel_busy: req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0
    kworker/u8:2-2367  [001]  5405.466795: rpc_socket_state_change: socket:[59266] srcaddr=192.168.122.166:833 dstaddr=192.168.122.100:2049 state=4 (DISCONNECTING) sk_state=7 (CLOSE)
    kworker/u8:2-2367  [001]  5405.466797: rpc_socket_close:     socket:[59266] srcaddr=192.168.122.166:833 dstaddr=192.168.122.100:2049 state=4 (DISCONNECTING) sk_state=7 (CLOSE)
    kworker/u8:2-2367  [001]  5405.466797: xprt_disconnect_done: peer=[192.168.122.100]:2049 state=LOCKED|BOUND
    kworker/u8:2-2367  [001]  5405.466798: xprt_release_xprt:    task:ffffffff@ffffffff snd_task:ffffffff
    kworker/u8:7-2407  [000]  5405.466804: bprint:               xs_tcp_tls_finish_connecting: xs_tcp_tls_finish_connecting(handshake status=0): sock is NULL!!

Do you have any suggestions about how to safely prevent ->close
from firing on the lower transport during a TLS handshake?


--
Chuck Lever




* Re: race between handshake and socket close
  2023-08-28 18:02 race between handshake and socket close Chuck Lever III
@ 2023-08-28 18:35 ` Trond Myklebust
  2023-08-28 19:46   ` Chuck Lever III
  0 siblings, 1 reply; 3+ messages in thread
From: Trond Myklebust @ 2023-08-28 18:35 UTC (permalink / raw)
  To: chuck.lever@oracle.com; +Cc: anna@kernel.org, linux-nfs@vger.kernel.org

On Mon, 2023-08-28 at 18:02 +0000, Chuck Lever III wrote:
> Hi -
> 
> Anna has identified a window during a handshake where a socket
> close will trigger a crash. This could be a common scenario for
> a rejected handshake.
> 
>           <idle>-0     [003]  5405.466661: rpc_socket_state_change:
> socket:[59266] srcaddr=192.168.122.166:833
> dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=8
> (CLOSE_WAIT)
>           <idle>-0     [003]  5405.466665: xs_data_ready:       
> peer=[192.168.122.100]:2049
>           <idle>-0     [003]  5405.466668: rpc_socket_state_change:
> socket:[59266] srcaddr=192.168.122.166:833
> dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=7 (CLOSE)
>           <idle>-0     [003]  5405.466669: rpc_socket_error:    
> error=-32 socket:[59266] srcaddr=192.168.122.166:833
> dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=7 (CLOSE)
>     kworker/u8:2-2367  [001]  5405.466786: xprt_disconnect_force:
> peer=[192.168.122.100]:2049 state=BOUND
>            tlshd-6005  [002]  5405.466786: handshake_cmd_done:  
> req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0 fd=6
>            tlshd-6005  [002]  5405.466786: handshake_complete:  
> req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0 status=13
>     kworker/u8:2-2367  [001]  5405.466789: xprt_disconnect_auto:
> peer=[192.168.122.100]:2049 state=LOCKED|CLOSE_WAIT|BOUND
>     kworker/u8:2-2367  [001]  5405.466790: function:            
> xs_reset_transport
>     kworker/u8:2-2367  [001]  5405.466794: kernel_stack:        
> <stack trace >
> => ftrace_trampoline (ffffffffc0c3409b)
> => xs_reset_transport (ffffffffc0814225)
> => xs_tcp_shutdown (ffffffffc0816b3e)
> => xprt_autoclose (ffffffffc0811799)
> => process_one_work (ffffffffa6ae2777)
> => worker_thread (ffffffffa6ae2d67)
> => kthread (ffffffffa6ae93e7)
> => ret_from_fork (ffffffffa6a424f7)
> => ret_from_fork_asm (ffffffffa6a03aeb)
>     kworker/u8:2-2367  [001]  5405.466795: handshake_cancel_busy:
> req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0
>     kworker/u8:2-2367  [001]  5405.466795: rpc_socket_state_change:
> socket:[59266] srcaddr=192.168.122.166:833
> dstaddr=192.168.122.100:2049 state=4 (DISCONNECTING) sk_state=7
> (CLOSE)
>     kworker/u8:2-2367  [001]  5405.466797: rpc_socket_close:    
> socket:[59266] srcaddr=192.168.122.166:833
> dstaddr=192.168.122.100:2049 state=4 (DISCONNECTING) sk_state=7
> (CLOSE)
>     kworker/u8:2-2367  [001]  5405.466797: xprt_disconnect_done:
> peer=[192.168.122.100]:2049 state=LOCKED|BOUND
>     kworker/u8:2-2367  [001]  5405.466798: xprt_release_xprt:   
> task:ffffffff@ffffffff snd_task:ffffffff
>     kworker/u8:7-2407  [000]  5405.466804: bprint:              
> xs_tcp_tls_finish_connecting: xs_tcp_tls_finish_connecting(handshake
> status=0): sock is NULL!!
> 
> Do you have any suggestions about how to safely prevent ->close
> from firing on the lower transport during a TLS handshake?
> 

Shouldn't something be holding a reference to the upper transport
across that call to set up the lower transport connection?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com




* Re: race between handshake and socket close
  2023-08-28 18:35 ` Trond Myklebust
@ 2023-08-28 19:46   ` Chuck Lever III
  0 siblings, 0 replies; 3+ messages in thread
From: Chuck Lever III @ 2023-08-28 19:46 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: anna@kernel.org, Linux NFS Mailing List



> On Aug 28, 2023, at 2:35 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Mon, 2023-08-28 at 18:02 +0000, Chuck Lever III wrote:
>> Hi -
>> 
>> Anna has identified a window during a handshake where a socket
>> close will trigger a crash. This could be a common scenario for
>> a rejected handshake.
>> 
>>           <idle>-0     [003]  5405.466661: rpc_socket_state_change:
>> socket:[59266] srcaddr=192.168.122.166:833
>> dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=8
>> (CLOSE_WAIT)
>>           <idle>-0     [003]  5405.466665: xs_data_ready:       
>> peer=[192.168.122.100]:2049
>>           <idle>-0     [003]  5405.466668: rpc_socket_state_change:
>> socket:[59266] srcaddr=192.168.122.166:833
>> dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=7 (CLOSE)
>>           <idle>-0     [003]  5405.466669: rpc_socket_error:    
>> error=-32 socket:[59266] srcaddr=192.168.122.166:833
>> dstaddr=192.168.122.100:2049 state=2 (CONNECTING) sk_state=7 (CLOSE)
>>     kworker/u8:2-2367  [001]  5405.466786: xprt_disconnect_force:
>> peer=[192.168.122.100]:2049 state=BOUND
>>            tlshd-6005  [002]  5405.466786: handshake_cmd_done:  
>> req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0 fd=6
>>            tlshd-6005  [002]  5405.466786: handshake_complete:  
>> req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0 status=13
>>     kworker/u8:2-2367  [001]  5405.466789: xprt_disconnect_auto:
>> peer=[192.168.122.100]:2049 state=LOCKED|CLOSE_WAIT|BOUND
>>     kworker/u8:2-2367  [001]  5405.466790: function:            
>> xs_reset_transport
>>     kworker/u8:2-2367  [001]  5405.466794: kernel_stack:        
>> <stack trace >
>> => ftrace_trampoline (ffffffffc0c3409b)
>> => xs_reset_transport (ffffffffc0814225)
>> => xs_tcp_shutdown (ffffffffc0816b3e)
>> => xprt_autoclose (ffffffffc0811799)
>> => process_one_work (ffffffffa6ae2777)
>> => worker_thread (ffffffffa6ae2d67)
>> => kthread (ffffffffa6ae93e7)
>> => ret_from_fork (ffffffffa6a424f7)
>> => ret_from_fork_asm (ffffffffa6a03aeb)
>>     kworker/u8:2-2367  [001]  5405.466795: handshake_cancel_busy:
>> req=0xffff9e8a43c7d000 sk=0xffff9e8a42e2f1c0
>>     kworker/u8:2-2367  [001]  5405.466795: rpc_socket_state_change:
>> socket:[59266] srcaddr=192.168.122.166:833
>> dstaddr=192.168.122.100:2049 state=4 (DISCONNECTING) sk_state=7
>> (CLOSE)
>>     kworker/u8:2-2367  [001]  5405.466797: rpc_socket_close:    
>> socket:[59266] srcaddr=192.168.122.166:833
>> dstaddr=192.168.122.100:2049 state=4 (DISCONNECTING) sk_state=7
>> (CLOSE)
>>     kworker/u8:2-2367  [001]  5405.466797: xprt_disconnect_done:
>> peer=[192.168.122.100]:2049 state=LOCKED|BOUND
>>     kworker/u8:2-2367  [001]  5405.466798: xprt_release_xprt:   
>> task:ffffffff@ffffffff snd_task:ffffffff
>>     kworker/u8:7-2407  [000]  5405.466804: bprint:              
>> xs_tcp_tls_finish_connecting: xs_tcp_tls_finish_connecting(handshake
>> status=0): sock is NULL!!
>> 
>> Do you have any suggestions about how to safely prevent ->close
>> from firing on the lower transport during a TLS handshake?
>> 
> 
> Shouldn't something be holding a reference to the upper transport
> across that call to set up the lower transport connection?

I didn't see other transports holding an extra reference. But it
can be added.

However, I don't understand how that would prevent the socket
from being closed and released while the lower transport still
owns it, which is what is happening here.


--
Chuck Lever



