Linux kernel -stable discussions
 help / color / mirror / Atom feed
* [PATCH 1/6] cifs: deal with the channel loading lag while picking channels
@ 2025-06-02 17:07 nspmangalore
  2025-06-02 17:07 ` [PATCH 2/6] cifs: reset connections for all channels when reconnect requested nspmangalore
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: nspmangalore @ 2025-06-02 17:07 UTC (permalink / raw)
  To: linux-cifs, smfrench, bharathsm.hsk, meetakshisetiyaoss, pc
  Cc: Shyam Prasad N, stable

From: Shyam Prasad N <sprasad@microsoft.com>

Our current approach to select a channel for sending requests is this:
1. iterate all channels to find the min and max queue depth
2. if min and max are not the same, pick the channel with min depth
3. if min and max are same, round robin, as all channels are equally loaded

The problem with this approach is that there's a lag between selecting
a channel and sending the request (that increases the queue depth on the channel).
While these numbers will eventually catch up, there could be a skew in the
channel usage, depending on the application's I/O parallelism and the server's
speed of handling requests.

With sufficient parallelism, this lag can artificially increase the queue depth,
thereby impacting the performance negatively.

This change will change the step 1 above to start the iteration from the last
selected channel. This is to reduce the skew in channel usage even in the presence
of this lag.

Fixes: ea90708d3cf3 ("cifs: use the least loaded channel for sending requests")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/smb/client/transport.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c
index 266af17aa7d9..191783f553ce 100644
--- a/fs/smb/client/transport.c
+++ b/fs/smb/client/transport.c
@@ -1018,14 +1018,16 @@ struct TCP_Server_Info *cifs_pick_channel(struct cifs_ses *ses)
 	uint index = 0;
 	unsigned int min_in_flight = UINT_MAX, max_in_flight = 0;
 	struct TCP_Server_Info *server = NULL;
-	int i;
+	int i, start, cur;
 
 	if (!ses)
 		return NULL;
 
 	spin_lock(&ses->chan_lock);
+	start = atomic_inc_return(&ses->chan_seq);
 	for (i = 0; i < ses->chan_count; i++) {
-		server = ses->chans[i].server;
+		cur = (start + i) % ses->chan_count;
+		server = ses->chans[cur].server;
 		if (!server || server->terminate)
 			continue;
 
@@ -1042,17 +1044,15 @@ struct TCP_Server_Info *cifs_pick_channel(struct cifs_ses *ses)
 		 */
 		if (server->in_flight < min_in_flight) {
 			min_in_flight = server->in_flight;
-			index = i;
+			index = cur;
 		}
 		if (server->in_flight > max_in_flight)
 			max_in_flight = server->in_flight;
 	}
 
 	/* if all channels are equally loaded, fall back to round-robin */
-	if (min_in_flight == max_in_flight) {
-		index = (uint)atomic_inc_return(&ses->chan_seq);
-		index %= ses->chan_count;
-	}
+	if (min_in_flight == max_in_flight)
+		index = (uint)start % ses->chan_count;
 
 	server = ses->chans[index].server;
 	spin_unlock(&ses->chan_lock);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 2/6] cifs: reset connections for all channels when reconnect requested
  2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
@ 2025-06-02 17:07 ` nspmangalore
  2025-06-02 17:07 ` [PATCH 3/6] cifs: update dstaddr whenever channel iface is updated nspmangalore
  2025-06-02 18:47 ` [PATCH 1/6] cifs: deal with the channel loading lag while picking channels Steve French
  2 siblings, 0 replies; 4+ messages in thread
From: nspmangalore @ 2025-06-02 17:07 UTC (permalink / raw)
  To: linux-cifs, smfrench, bharathsm.hsk, meetakshisetiyaoss, pc
  Cc: Shyam Prasad N, stable

From: Shyam Prasad N <sprasad@microsoft.com>

cifs_reconnect can be called with a flag to mark the session as needing
reconnect too. When this is done, we expect the connections of all
channels to be reconnected too, which is not happening today.

Without doing this, we have seen bad things happen when primary and
secondary channels are connected to different servers (in case of cloud
services like Azure Files SMB).

This change would force all connections to reconnect as well, not just
the sessions and tcons.

Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/smb/client/connect.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
index 6bf04d9a5491..ca1cb01c6ef8 100644
--- a/fs/smb/client/connect.c
+++ b/fs/smb/client/connect.c
@@ -377,6 +377,13 @@ static int __cifs_reconnect(struct TCP_Server_Info *server,
 	if (!cifs_tcp_ses_needs_reconnect(server, 1))
 		return 0;
 
+	/*
+	 * if smb session has been marked for reconnect, also reconnect all
+	 * connections. This way, the other connections do not end up bad.
+	 */
+	if (mark_smb_session)
+		cifs_signal_cifsd_for_reconnect(server, mark_smb_session);
+
 	cifs_mark_tcp_ses_conns_for_reconnect(server, mark_smb_session);
 
 	cifs_abort_connection(server);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH 3/6] cifs: update dstaddr whenever channel iface is updated
  2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
  2025-06-02 17:07 ` [PATCH 2/6] cifs: reset connections for all channels when reconnect requested nspmangalore
@ 2025-06-02 17:07 ` nspmangalore
  2025-06-02 18:47 ` [PATCH 1/6] cifs: deal with the channel loading lag while picking channels Steve French
  2 siblings, 0 replies; 4+ messages in thread
From: nspmangalore @ 2025-06-02 17:07 UTC (permalink / raw)
  To: linux-cifs, smfrench, bharathsm.hsk, meetakshisetiyaoss, pc
  Cc: Shyam Prasad N, stable

From: Shyam Prasad N <sprasad@microsoft.com>

When the server interface info changes (more common in clustered
servers like Azure Files), the per-channel iface gets updated.
However, this did not update the corresponding dstaddr. As a result
these channels will still connect (or try connecting) to older addresses.

Fixes: b54034a73baf ("cifs: during reconnect, update interface if necessary")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/smb/client/sess.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/smb/client/sess.c b/fs/smb/client/sess.c
index b3fa9ee26912..8add3ba14e9f 100644
--- a/fs/smb/client/sess.c
+++ b/fs/smb/client/sess.c
@@ -445,6 +445,10 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
 
 	ses->chans[chan_index].iface = iface;
 	spin_unlock(&ses->chan_lock);
+
+	spin_lock(&server->srv_lock);
+	memcpy(&server->dstaddr, &iface->sockaddr, sizeof(server->dstaddr));
+	spin_unlock(&server->srv_lock);
 }
 
 static int
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 1/6] cifs: deal with the channel loading lag while picking channels
  2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
  2025-06-02 17:07 ` [PATCH 2/6] cifs: reset connections for all channels when reconnect requested nspmangalore
  2025-06-02 17:07 ` [PATCH 3/6] cifs: update dstaddr whenever channel iface is updated nspmangalore
@ 2025-06-02 18:47 ` Steve French
  2 siblings, 0 replies; 4+ messages in thread
From: Steve French @ 2025-06-02 18:47 UTC (permalink / raw)
  To: nspmangalore
  Cc: linux-cifs, bharathsm.hsk, meetakshisetiyaoss, pc, Shyam Prasad N,
	stable

Merged all six of these multichannel fixes into cifs-2.6.git for-next
pending additional testing and any review comments but these three in
particular looked most obvious/safe:

b4f60a053a25 cifs: dns resolution is needed only for primary channel
c1846893991f cifs: update dstaddr whenever channel iface is updated
1f396b9bfe39 cifs: reset connections for all channels when reconnect requested

I want to look more carefully especially at these three:
2f2c5d38fb9d (HEAD -> for-next, origin/for-next) cifs: do not disable
interface polling on failure
4394c936623d cifs: serialize other channels when query server
interfaces is pending
bf75ad3631c7 cifs: deal with the channel loading lag while picking channels

On Mon, Jun 2, 2025 at 12:09 PM <nspmangalore@gmail.com> wrote:
>
> From: Shyam Prasad N <sprasad@microsoft.com>
>
> Our current approach to select a channel for sending requests is this:
> 1. iterate all channels to find the min and max queue depth
> 2. if min and max are not the same, pick the channel with min depth
> 3. if min and max are same, round robin, as all channels are equally loaded
>
> The problem with this approach is that there's a lag between selecting
> a channel and sending the request (that increases the queue depth on the channel).
> While these numbers will eventually catch up, there could be a skew in the
> channel usage, depending on the application's I/O parallelism and the server's
> speed of handling requests.
>
> With sufficient parallelism, this lag can artificially increase the queue depth,
> thereby impacting the performance negatively.
>
> This change will change the step 1 above to start the iteration from the last
> selected channel. This is to reduce the skew in channel usage even in the presence
> of this lag.
>
> Fixes: ea90708d3cf3 ("cifs: use the least loaded channel for sending requests")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> ---
>  fs/smb/client/transport.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c
> index 266af17aa7d9..191783f553ce 100644
> --- a/fs/smb/client/transport.c
> +++ b/fs/smb/client/transport.c
> @@ -1018,14 +1018,16 @@ struct TCP_Server_Info *cifs_pick_channel(struct cifs_ses *ses)
>         uint index = 0;
>         unsigned int min_in_flight = UINT_MAX, max_in_flight = 0;
>         struct TCP_Server_Info *server = NULL;
> -       int i;
> +       int i, start, cur;
>
>         if (!ses)
>                 return NULL;
>
>         spin_lock(&ses->chan_lock);
> +       start = atomic_inc_return(&ses->chan_seq);
>         for (i = 0; i < ses->chan_count; i++) {
> -               server = ses->chans[i].server;
> +               cur = (start + i) % ses->chan_count;
> +               server = ses->chans[cur].server;
>                 if (!server || server->terminate)
>                         continue;
>
> @@ -1042,17 +1044,15 @@ struct TCP_Server_Info *cifs_pick_channel(struct cifs_ses *ses)
>                  */
>                 if (server->in_flight < min_in_flight) {
>                         min_in_flight = server->in_flight;
> -                       index = i;
> +                       index = cur;
>                 }
>                 if (server->in_flight > max_in_flight)
>                         max_in_flight = server->in_flight;
>         }
>
>         /* if all channels are equally loaded, fall back to round-robin */
> -       if (min_in_flight == max_in_flight) {
> -               index = (uint)atomic_inc_return(&ses->chan_seq);
> -               index %= ses->chan_count;
> -       }
> +       if (min_in_flight == max_in_flight)
> +               index = (uint)start % ses->chan_count;
>
>         server = ses->chans[index].server;
>         spin_unlock(&ses->chan_lock);
> --
> 2.43.0
>


-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-06-02 18:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-02 17:07 [PATCH 1/6] cifs: deal with the channel loading lag while picking channels nspmangalore
2025-06-02 17:07 ` [PATCH 2/6] cifs: reset connections for all channels when reconnect requested nspmangalore
2025-06-02 17:07 ` [PATCH 3/6] cifs: update dstaddr whenever channel iface is updated nspmangalore
2025-06-02 18:47 ` [PATCH 1/6] cifs: deal with the channel loading lag while picking channels Steve French

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox