* [PATCH 2/6] cifs: reset connections for all channels when reconnect requested
2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
@ 2025-06-02 17:07 ` nspmangalore
2025-06-02 17:07 ` [PATCH 3/6] cifs: update dstaddr whenever channel iface is updated nspmangalore
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: nspmangalore @ 2025-06-02 17:07 UTC (permalink / raw)
To: linux-cifs, smfrench, bharathsm.hsk, meetakshisetiyaoss, pc
Cc: Shyam Prasad N, stable
From: Shyam Prasad N <sprasad@microsoft.com>
cifs_reconnect can be called with a flag to mark the session as needing
reconnect too. When this is done, we expect the connections of all
channels to be reconnected too, which is not happening today.
Without doing this, we have seen bad things happen when primary and
secondary channels are connected to different servers (in case of cloud
services like Azure Files SMB).
This change would force all connections to reconnect as well, not just
the sessions and tcons.
Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/smb/client/connect.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index 6bf04d9a5491..ca1cb01c6ef8 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -377,6 +377,13 @@ static int __cifs_reconnect(struct TCP_Server_Info *server,
if (!cifs_tcp_ses_needs_reconnect(server, 1))
return 0;
+ /*
+ * if smb session has been marked for reconnect, also reconnect all
+ * connections. This way, the other connections do not end up bad.
+ */
+ if (mark_smb_session)
+ cifs_signal_cifsd_for_reconnect(server, mark_smb_session);
+
cifs_mark_tcp_ses_conns_for_reconnect(server, mark_smb_session);
cifs_abort_connection(server);
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH 3/6] cifs: update dstaddr whenever channel iface is updated
2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
2025-06-02 17:07 ` [PATCH 2/6] cifs: reset connections for all channels when reconnect requested nspmangalore
@ 2025-06-02 17:07 ` nspmangalore
2025-06-02 17:07 ` [PATCH 4/6] cifs: serialize other channels when query server interfaces is pending nspmangalore
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: nspmangalore @ 2025-06-02 17:07 UTC (permalink / raw)
To: linux-cifs, smfrench, bharathsm.hsk, meetakshisetiyaoss, pc
Cc: Shyam Prasad N, stable
From: Shyam Prasad N <sprasad@microsoft.com>
When the server interface info changes (more common in clustered
servers like Azure Files), the per-channel iface gets updated.
However, this did not update the corresponding dstaddr. As a result
these channels will still connect (or try connecting) to older addresses.
Fixes: b54034a73baf ("cifs: during reconnect, update interface if necessary")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/smb/client/sess.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/smb/client/sess.c b/fs/smb/client/sess.c
index b3fa9ee26912..8add3ba14e9f 100644
--- a/fs/smb/client/sess.c
+++ b/fs/smb/client/sess.c
@@ -445,6 +445,10 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
ses->chans[chan_index].iface = iface;
spin_unlock(&ses->chan_lock);
+
+ spin_lock(&server->srv_lock);
+ memcpy(&server->dstaddr, &iface->sockaddr, sizeof(server->dstaddr));
+ spin_unlock(&server->srv_lock);
}
static int
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH 4/6] cifs: serialize other channels when query server interfaces is pending
2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
2025-06-02 17:07 ` [PATCH 2/6] cifs: reset connections for all channels when reconnect requested nspmangalore
2025-06-02 17:07 ` [PATCH 3/6] cifs: update dstaddr whenever channel iface is updated nspmangalore
@ 2025-06-02 17:07 ` nspmangalore
2025-06-02 17:07 ` [PATCH 5/6] cifs: dns resolution is needed only for primary channel nspmangalore
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: nspmangalore @ 2025-06-02 17:07 UTC (permalink / raw)
To: linux-cifs, smfrench, bharathsm.hsk, meetakshisetiyaoss, pc
Cc: Shyam Prasad N
From: Shyam Prasad N <sprasad@microsoft.com>
Today, during smb2_reconnect, session_mutex is released as soon as
the tcon is reconnected and is in a good state. However, in case
multichannel is enabled, there is also a query of server interfaces that
follows. We've seen that this query can race with reconnects of other
channels, causing them to step on each other with reconnects.
This change extends the hold of session_mutex till after the query of
server interfaces is complete. In order to avoid recursive smb2_reconnect
checks during query ioctl, this change also introduces a session flag
for sessions where such a query is in progress.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/smb/client/cifsglob.h | 1 +
fs/smb/client/smb2pdu.c | 24 ++++++++++++++++++------
2 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index 3b32116b0b49..0c80ca352f3f 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -1084,6 +1084,7 @@ struct cifs_chan {
};
#define CIFS_SES_FLAG_SCALE_CHANNELS (0x1)
+#define CIFS_SES_FLAGS_PENDING_QUERY_INTERFACES (0x2)
/*
* Session structure. One of these for each uid session with a particular host
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 4e28632b5fd6..59a6b86c3786 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -411,14 +411,19 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
if (!rc &&
(server->capabilities & SMB2_GLOBAL_CAP_MULTI_CHANNEL) &&
server->ops->query_server_interfaces) {
- mutex_unlock(&ses->session_mutex);
-
/*
- * query server network interfaces, in case they change
+ * query server network interfaces, in case they change.
+ * Also mark the session as pending this update while the query
+ * is in progress. This will be used to avoid calling
+ * smb2_reconnect recursively.
*/
+ ses->flags |= CIFS_SES_FLAGS_PENDING_QUERY_INTERFACES;
xid = get_xid();
rc = server->ops->query_server_interfaces(xid, tcon, false);
free_xid(xid);
+ ses->flags &= ~CIFS_SES_FLAGS_PENDING_QUERY_INTERFACES;
+
+ mutex_unlock(&ses->session_mutex);
if (rc == -EOPNOTSUPP && ses->chan_count > 1) {
/*
@@ -560,11 +565,18 @@ static int smb2_ioctl_req_init(u32 opcode, struct cifs_tcon *tcon,
struct TCP_Server_Info *server,
void **request_buf, unsigned int *total_len)
{
- /* Skip reconnect only for FSCTL_VALIDATE_NEGOTIATE_INFO IOCTLs */
- if (opcode == FSCTL_VALIDATE_NEGOTIATE_INFO) {
+ /*
+ * Skip reconnect in one of the following cases:
+ * 1. For FSCTL_VALIDATE_NEGOTIATE_INFO IOCTLs
+ * 2. For FSCTL_QUERY_NETWORK_INTERFACE_INFO IOCTL when called from
+ * smb2_reconnect (indicated by CIFS_SES_FLAG_SCALE_CHANNELS ses flag)
+ */
+ if (opcode == FSCTL_VALIDATE_NEGOTIATE_INFO ||
+ (opcode == FSCTL_QUERY_NETWORK_INTERFACE_INFO &&
+ (tcon->ses->flags & CIFS_SES_FLAGS_PENDING_QUERY_INTERFACES)))
return __smb2_plain_req_init(SMB2_IOCTL, tcon, server,
request_buf, total_len);
- }
+
return smb2_plain_req_init(SMB2_IOCTL, tcon, server,
request_buf, total_len);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH 5/6] cifs: dns resolution is needed only for primary channel
2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
` (2 preceding siblings ...)
2025-06-02 17:07 ` [PATCH 4/6] cifs: serialize other channels when query server interfaces is pending nspmangalore
@ 2025-06-02 17:07 ` nspmangalore
2025-06-02 17:07 ` [PATCH 6/6] cifs: do not disable interface polling on failure nspmangalore
2025-06-02 18:47 ` [PATCH 1/6] cifs: deal with the channel loading lag while picking channels Steve French
5 siblings, 0 replies; 7+ messages in thread
From: nspmangalore @ 2025-06-02 17:07 UTC (permalink / raw)
To: linux-cifs, smfrench, bharathsm.hsk, meetakshisetiyaoss, pc
Cc: Shyam Prasad N
From: Shyam Prasad N <sprasad@microsoft.com>
When calling cifs_reconnect, before the connection to the
server is reestablished, the code today does a DNS resolution and
updates server->dstaddr.
However, this is not necessary for secondary channels. Secondary
channels use the interface list returned by the server to decide
which address to connect to. And that happens after tcon is reconnected
and server interfaces are requested.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/smb/client/connect.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index ca1cb01c6ef8..024817d40c5f 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -392,7 +392,8 @@ static int __cifs_reconnect(struct TCP_Server_Info *server,
try_to_freeze();
cifs_server_lock(server);
- if (!cifs_swn_set_server_dstaddr(server)) {
+ if (!cifs_swn_set_server_dstaddr(server) &&
+ !SERVER_IS_CHAN(server)) {
/* resolve the hostname again to make sure that IP address is up-to-date */
rc = reconn_set_ipaddr_from_hostname(server);
cifs_dbg(FYI, "%s: reconn_set_ipaddr_from_hostname: rc=%d\n", __func__, rc);
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH 6/6] cifs: do not disable interface polling on failure
2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
` (3 preceding siblings ...)
2025-06-02 17:07 ` [PATCH 5/6] cifs: dns resolution is needed only for primary channel nspmangalore
@ 2025-06-02 17:07 ` nspmangalore
2025-06-02 18:47 ` [PATCH 1/6] cifs: deal with the channel loading lag while picking channels Steve French
5 siblings, 0 replies; 7+ messages in thread
From: nspmangalore @ 2025-06-02 17:07 UTC (permalink / raw)
To: linux-cifs, smfrench, bharathsm.hsk, meetakshisetiyaoss, pc
Cc: Shyam Prasad N
From: Shyam Prasad N <sprasad@microsoft.com>
When a server has multichannel enabled, we keep polling the server
for interfaces periodically. However, when this query fails, we
disable the polling. This can be problematic as it takes away the
chance for the server to start advertizing again.
This change reschedules the delayed work, even if the current call
failed. That way, multichannel sessions can recover.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/smb/client/connect.c | 6 +-----
fs/smb/client/smb2pdu.c | 9 +++++----
2 files changed, 6 insertions(+), 9 deletions(-)
diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index 024817d40c5f..28bc33496623 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -116,13 +116,9 @@ static void smb2_query_server_interfaces(struct work_struct *work)
rc = server->ops->query_server_interfaces(xid, tcon, false);
free_xid(xid);
- if (rc) {
- if (rc == -EOPNOTSUPP)
- return;
-
+ if (rc)
cifs_dbg(FYI, "%s: failed to query server interfaces: %d\n",
__func__, rc);
- }
queue_delayed_work(cifsiod_wq, &tcon->query_interfaces,
(SMB_INTERFACE_POLL_INTERVAL * HZ));
diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
index 59a6b86c3786..50c9e7ba15b0 100644
--- a/fs/smb/client/smb2pdu.c
+++ b/fs/smb/client/smb2pdu.c
@@ -423,6 +423,10 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
free_xid(xid);
ses->flags &= ~CIFS_SES_FLAGS_PENDING_QUERY_INTERFACES;
+ /* regardless of rc value, setup polling */
+ queue_delayed_work(cifsiod_wq, &tcon->query_interfaces,
+ (SMB_INTERFACE_POLL_INTERVAL * HZ));
+
mutex_unlock(&ses->session_mutex);
if (rc == -EOPNOTSUPP && ses->chan_count > 1) {
@@ -443,11 +447,8 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
if (ses->chan_max > ses->chan_count &&
ses->iface_count &&
!SERVER_IS_CHAN(server)) {
- if (ses->chan_count == 1) {
+ if (ses->chan_count == 1)
cifs_server_dbg(VFS, "supports multichannel now\n");
- queue_delayed_work(cifsiod_wq, &tcon->query_interfaces,
- (SMB_INTERFACE_POLL_INTERVAL * HZ));
- }
cifs_try_adding_channels(ses);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH 1/6] cifs: deal with the channel loading lag while picking channels
2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
` (4 preceding siblings ...)
2025-06-02 17:07 ` [PATCH 6/6] cifs: do not disable interface polling on failure nspmangalore
@ 2025-06-02 18:47 ` Steve French
5 siblings, 0 replies; 7+ messages in thread
From: Steve French @ 2025-06-02 18:47 UTC (permalink / raw)
To: nspmangalore
Cc: linux-cifs, bharathsm.hsk, meetakshisetiyaoss, pc, Shyam Prasad N,
stable
Merged all six of these multichannel fixes into cifs-2.6.git for-next
pending additional testing and any review comments but these three in
particular looked most obvious/safe:
b4f60a053a25 cifs: dns resolution is needed only for primary channel
c1846893991f cifs: update dstaddr whenever channel iface is updated
1f396b9bfe39 cifs: reset connections for all channels when reconnect requested
I want to look more carefully especially at these three:
2f2c5d38fb9d (HEAD -> for-next, origin/for-next) cifs: do not disable
interface polling on failure
4394c936623d cifs: serialize other channels when query server
interfaces is pending
bf75ad3631c7 cifs: deal with the channel loading lag while picking channels
On Mon, Jun 2, 2025 at 12:09 PM <nspmangalore@gmail.com> wrote:
>
> From: Shyam Prasad N <sprasad@microsoft.com>
>
> Our current approach to select a channel for sending requests is this:
> 1. iterate all channels to find the min and max queue depth
> 2. if min and max are not the same, pick the channel with min depth
> 3. if min and max are same, round robin, as all channels are equally loaded
>
> The problem with this approach is that there's a lag between selecting
> a channel and sending the request (that increases the queue depth on the channel).
> While these numbers will eventually catch up, there could be a skew in the
> channel usage, depending on the application's I/O parallelism and the server's
> speed of handling requests.
>
> With sufficient parallelism, this lag can artificially increase the queue depth,
> thereby impacting the performance negatively.
>
> This change will change the step 1 above to start the iteration from the last
> selected channel. This is to reduce the skew in channel usage even in the presence
> of this lag.
>
> Fixes: ea90708d3cf3 ("cifs: use the least loaded channel for sending requests")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> ---
> fs/smb/client/transport.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c
> index 266af17aa7d9..191783f553ce 100644
> --- a/fs/smb/client/transport.c
> +++ b/fs/smb/client/transport.c
> @@ -1018,14 +1018,16 @@ struct TCP_Server_Info *cifs_pick_channel(struct cifs_ses *ses)
> uint index = 0;
> unsigned int min_in_flight = UINT_MAX, max_in_flight = 0;
> struct TCP_Server_Info *server = NULL;
> - int i;
> + int i, start, cur;
>
> if (!ses)
> return NULL;
>
> spin_lock(&ses->chan_lock);
> + start = atomic_inc_return(&ses->chan_seq);
> for (i = 0; i < ses->chan_count; i++) {
> - server = ses->chans[i].server;
> + cur = (start + i) % ses->chan_count;
> + server = ses->chans[cur].server;
> if (!server || server->terminate)
> continue;
>
> @@ -1042,17 +1044,15 @@ struct TCP_Server_Info *cifs_pick_channel(struct cifs_ses *ses)
> */
> if (server->in_flight < min_in_flight) {
> min_in_flight = server->in_flight;
> - index = i;
> + index = cur;
> }
> if (server->in_flight > max_in_flight)
> max_in_flight = server->in_flight;
> }
>
> /* if all channels are equally loaded, fall back to round-robin */
> - if (min_in_flight == max_in_flight) {
> - index = (uint)atomic_inc_return(&ses->chan_seq);
> - index %= ses->chan_count;
> - }
> + if (min_in_flight == max_in_flight)
> + index = (uint)start % ses->chan_count;
>
> server = ses->chans[index].server;
> spin_unlock(&ses->chan_lock);
> --
> 2.43.0
>
--
Thanks,
Steve
^ permalink raw reply [flat|nested] 7+ messages in thread