From: longli@linux.microsoft.com
To: "K. Y. Srinivasan" <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
James Bottomley <JBottomley@Odin.com>,
linux-hyperv@vger.kernel.org, linux-scsi@vger.kernel.org,
linux-kernel@vger.kernel.org
Cc: Long Li <longli@microsoft.com>
Subject: [PATCH] scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU
Date: Wed, 1 Oct 2025 22:05:30 -0700 [thread overview]
Message-ID: <1759381530-7414-1-git-send-email-longli@linux.microsoft.com> (raw)
From: Long Li <longli@microsoft.com>
When selecting an outgoing channel for I/O, storvsc tries to select a
channel with a returning CPU that is not the same as issuing CPU. This
worked well in the past, however it doesn't work well when the Hyper-V
exposes a large number of channels (up to the number of all CPUs). Use
a different CPU for returning channel is not efficient on Hyper-V.
Change this behavior by preferring to the channel with the same CPU
as the current I/O issuing CPU whenever possible.
Tests have shown improvements in newer Hyper-V/Azure environment, and
no regression with older Hyper-V/Azure environments.
Tested-by: Raheel Abdul Faizy <rabdulfaizy@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
---
drivers/scsi/storvsc_drv.c | 96 ++++++++++++++++++--------------------
1 file changed, 45 insertions(+), 51 deletions(-)
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index d9e59204a9c3..092939791ea0 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1406,14 +1406,19 @@ static struct vmbus_channel *get_og_chn(struct storvsc_device *stor_device,
}
/*
- * Our channel array is sparsley populated and we
+ * Our channel array could be sparsley populated and we
* initiated I/O on a processor/hw-q that does not
* currently have a designated channel. Fix this.
* The strategy is simple:
- * I. Ensure NUMA locality
- * II. Distribute evenly (best effort)
+ * I. Prefer the channel associated with the current CPU
+ * II. Ensure NUMA locality
+ * III. Distribute evenly (best effort)
*/
+ /* Prefer the channel on the I/O issuing processor/hw-q */
+ if (cpumask_test_cpu(q_num, &stor_device->alloced_cpus))
+ return stor_device->stor_chns[q_num];
+
node_mask = cpumask_of_node(cpu_to_node(q_num));
num_channels = 0;
@@ -1469,59 +1474,48 @@ static int storvsc_do_io(struct hv_device *device,
/* See storvsc_change_target_cpu(). */
outgoing_channel = READ_ONCE(stor_device->stor_chns[q_num]);
if (outgoing_channel != NULL) {
- if (outgoing_channel->target_cpu == q_num) {
- /*
- * Ideally, we want to pick a different channel if
- * available on the same NUMA node.
- */
- node_mask = cpumask_of_node(cpu_to_node(q_num));
- for_each_cpu_wrap(tgt_cpu,
- &stor_device->alloced_cpus, q_num + 1) {
- if (!cpumask_test_cpu(tgt_cpu, node_mask))
- continue;
- if (tgt_cpu == q_num)
- continue;
- channel = READ_ONCE(
- stor_device->stor_chns[tgt_cpu]);
- if (channel == NULL)
- continue;
- if (hv_get_avail_to_write_percent(
- &channel->outbound)
- > ring_avail_percent_lowater) {
- outgoing_channel = channel;
- goto found_channel;
- }
- }
+ if (hv_get_avail_to_write_percent(&outgoing_channel->outbound)
+ > ring_avail_percent_lowater)
+ goto found_channel;
- /*
- * All the other channels on the same NUMA node are
- * busy. Try to use the channel on the current CPU
- */
- if (hv_get_avail_to_write_percent(
- &outgoing_channel->outbound)
- > ring_avail_percent_lowater)
+ /*
+ * Channel is busy, try to find a channel on the same NUMA node
+ */
+ node_mask = cpumask_of_node(cpu_to_node(q_num));
+ for_each_cpu_wrap(tgt_cpu, &stor_device->alloced_cpus,
+ q_num + 1) {
+ if (!cpumask_test_cpu(tgt_cpu, node_mask))
+ continue;
+ channel = READ_ONCE(stor_device->stor_chns[tgt_cpu]);
+ if (!channel)
+ continue;
+ if (hv_get_avail_to_write_percent(&channel->outbound)
+ > ring_avail_percent_lowater) {
+ outgoing_channel = channel;
goto found_channel;
+ }
+ }
- /*
- * If we reach here, all the channels on the current
- * NUMA node are busy. Try to find a channel in
- * other NUMA nodes
- */
- for_each_cpu(tgt_cpu, &stor_device->alloced_cpus) {
- if (cpumask_test_cpu(tgt_cpu, node_mask))
- continue;
- channel = READ_ONCE(
- stor_device->stor_chns[tgt_cpu]);
- if (channel == NULL)
- continue;
- if (hv_get_avail_to_write_percent(
- &channel->outbound)
- > ring_avail_percent_lowater) {
- outgoing_channel = channel;
- goto found_channel;
- }
+ /*
+ * If we reach here, all the channels on the current
+ * NUMA node are busy. Try to find a channel in
+ * all NUMA nodes
+ */
+ for_each_cpu_wrap(tgt_cpu, &stor_device->alloced_cpus,
+ q_num + 1) {
+ channel = READ_ONCE(stor_device->stor_chns[tgt_cpu]);
+ if (!channel)
+ continue;
+ if (hv_get_avail_to_write_percent(&channel->outbound)
+ > ring_avail_percent_lowater) {
+ outgoing_channel = channel;
+ goto found_channel;
}
}
+ /*
+ * If we reach here, all the channels are busy. Use the
+ * original channel found.
+ */
} else {
spin_lock_irqsave(&stor_device->lock, flags);
outgoing_channel = stor_device->stor_chns[q_num];
--
2.34.1
next reply other threads:[~2025-10-02 5:05 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-02 5:05 longli [this message]
2025-10-07 2:05 ` [PATCH] scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU Martin K. Petersen
2025-10-07 15:41 ` Michael Kelley
2025-10-08 0:55 ` Long Li
2025-10-08 15:30 ` Michael Kelley
2025-10-08 17:20 ` Long Li
2025-10-08 18:24 ` Michael Kelley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1759381530-7414-1-git-send-email-longli@linux.microsoft.com \
--to=longli@linux.microsoft.com \
--cc=JBottomley@Odin.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=decui@microsoft.com \
--cc=haiyangz@microsoft.com \
--cc=kys@microsoft.com \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=longli@microsoft.com \
--cc=martin.petersen@oracle.com \
--cc=wei.liu@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).