Linux CIFS filesystem development
* [PATCH] smb: client: fix first failure in negotiation after server reboot
From: zhangjian (CG) @ 2025-06-13 10:24 UTC
  To: stfrench, smfrench, longli, wangzhaolong1, metze, dhowells, pc
  Cc: linux-cifs-client, linux-kernel, linux-cifs

After fabc4ed200f9, server_unresponsive() has an additional condition that
decides whether the client needs to reconnect based on server->lstrp. When
the client fails to reconnect within 180s, it aborts the connection and
updates server->lstrp for the last time. In the following scenario,
server->lstrp is too old, which can make the first negotiation fail.

client                                         | server
-----------------------------------------------+-----------
mount to cifs server                           |
ls                                             |
                                               | reboot
    stuck for 180s and return EHOSTDOWN        |
    abort connection and update server->lstrp  |
                                               | service smb restart
ls                                             |
    smb_negotiate                              |
        server_unresponsive is true [in cifsd] |
        cifs_sync_mid_result return EAGAIN     |
    smb_negotiate return EHOSTDOWN             |
ls failed                                      |

So we update server->lstrp right before switching into the CifsInNegotiate
state to avoid this failure.

Fixes: fabc4ed200f9 ("smb: client: fix hang in wait_for_response() for negproto")
Signed-off-by: zhangjian <zhangjian496@huawei.com>
---
 fs/smb/client/connect.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index 28bc33496..f9aef60f1 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -4193,6 +4193,7 @@ cifs_negotiate_protocol(const unsigned int xid, struct cifs_ses *ses,
 		return 0;
 	}
 
+	server->lstrp = jiffies;
 	server->tcpStatus = CifsInNegotiate;
 	spin_unlock(&server->srv_lock);
 
-- 
2.33.0


* [PATCH] smb: client: fix first failure in negotiation after server reboot
From: zhangjian (CG) @ 2025-06-13 10:44 UTC
  To: stfrench, smfrench, longli, wangzhaolong1, metze, dhowells, pc
  Cc: linux-kernel, linux-cifs

After fabc4ed200f9, server_unresponsive() has an additional condition that
decides whether the client needs to reconnect based on server->lstrp. When
the client fails to reconnect within 180s, it aborts the connection and
updates server->lstrp for the last time. In the following scenario,
server->lstrp is too old, which can make the first negotiation fail.

client                                                 | server
-------------------------------------------------------+------------------
mount to cifs server                                   |
ls                                                     |
                                                       | reboot
    stuck for 180s and return EHOSTDOWN                |
    abort connection and update server->lstrp          |
                                                       | sleep 21s
                                                       | service smb restart
ls                                                     |
    smb_negotiate                                      |
        server_unresponsive cause reconnect [in cifsd] |
        ( tcpStatus == CifsInNegotiate &&              |
          jiffies > server->lstrp + 20s )              |
        cifs_sync_mid_result return EAGAIN             |
    smb_negotiate return EHOSTDOWN                     |
ls failed                                              |

The condition (tcpStatus == CifsInNegotiate && jiffies > server->lstrp + 20s)
is meant to fire only when the client has stayed in the CifsInNegotiate state
for more than 20s. With a stale server->lstrp it fires as soon as the client
enters that state. So we update server->lstrp right before switching into the
CifsInNegotiate state to avoid this failure.

Fixes: fabc4ed200f9 ("smb: client: fix hang in wait_for_response() for negproto")
Signed-off-by: zhangjian <zhangjian496@huawei.com>
---
 fs/smb/client/connect.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index 28bc33496..f9aef60f1 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -4193,6 +4193,7 @@ cifs_negotiate_protocol(const unsigned int xid, struct cifs_ses *ses,
 		return 0;
 	}
 
+	server->lstrp = jiffies;
 	server->tcpStatus = CifsInNegotiate;
 	spin_unlock(&server->srv_lock);
 
-- 
2.33.0


* Re: [EXTERNAL] Re: [PATCH] smb: client: fix first failure in negotiation after server reboot
From: zhangjian (CG) @ 2025-06-16  2:15 UTC
  To: Shyam Prasad (Azure Files); +Cc: linux-cifs, linux-kernel

In addition, if negotiation receives no response, there are two possible
outcomes:
1. server_unresponsive() may trigger a reconnect again and return true.
2. server_unresponsive() may return false, and the client falls back to the
CifsNeedNegotiate state and triggers a reconnect in SMB2_echo.

These two cases are similar to what happens when first mounting the
cifs server.

On 2025/6/16 10:01, zhangjian (CG) wrote:
>
> On 2025/6/15 21:08, Shyam Prasad (Azure Files) wrote:
>> Can we have a situation where sock_recvmsg has just timed out and, before we loop back to server_unresponsive, another parallel negotiate updates lstrp?
> 
> Negotiation only happens when the connection is reachable. The client sends
> a negotiation message to the server. If sock_recvmsg has just timed out and
> we loop back to server_unresponsive, it will return false. The client then
> calls sock_recvmsg again and waits for the negotiation response. Everything
> is OK.
> 
>> That will cause us to not detect the server unresponsive situation, even if that did happen.
>> server->lstrp is meant to store the last "response" time from the server.
> 
> server->lstrp is also updated when setting up and aborting a connection,
> even when there is no response. These updates can be regarded as initial
> values for server->lstrp.
> I think server->lstrp needs an initial value set before negotiation rather
> than at connection time.
> 
>>
>>
>> Regards,
>>
>> Shyam
>>
>>
>>
>>
>> ________________________________
>> From: Steve French <smfrench@gmail.com>
>> Sent: Friday, June 13, 2025 20:53
>> To: zhangjian (CG) <zhangjian496@huawei.com>
>> Cc: Shyam Prasad (Azure Files) <Shyam.Prasad@microsoft.com>; Paulo Alcantara <pc@manguebit.com>
>> Subject: [EXTERNAL] Re: [PATCH] smb: client: fix first failure in negotiation after server reboot
>>
>> Could you clarify the reproduction scenario? It was a little hard to read
>>
>>
>>
>>
>> --
>> Thanks,
>>
>> Steve
> 

