From: Souradeep Chakrabarti <schakrabarti@microsoft.com>
To: Yury Norov <yury.norov@gmail.com>,
Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>,
KY Srinivasan <kys@microsoft.com>,
Haiyang Zhang <haiyangz@microsoft.com>,
"wei.liu@kernel.org" <wei.liu@kernel.org>,
Dexuan Cui <decui@microsoft.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"edumazet@google.com" <edumazet@google.com>,
"kuba@kernel.org" <kuba@kernel.org>,
"pabeni@redhat.com" <pabeni@redhat.com>,
Long Li <longli@microsoft.com>,
"leon@kernel.org" <leon@kernel.org>,
"cai.huoqing@linux.dev" <cai.huoqing@linux.dev>,
"ssengar@linux.microsoft.com" <ssengar@linux.microsoft.com>,
"vkuznets@redhat.com" <vkuznets@redhat.com>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Cc: Paul Rosswurm <paulros@microsoft.com>
Subject: RE: [EXTERNAL] [PATCH 3/3] net: mana: add a function to spread IRQs per CPUs
Date: Tue, 19 Dec 2023 10:18:49 +0000
Message-ID: <PUZP153MB07886CE88351F6B7A2AA0096CC97A@PUZP153MB0788.APCP153.PROD.OUTLOOK.COM>
In-Reply-To: <20231217213214.1905481-4-yury.norov@gmail.com>
>-----Original Message-----
>From: Yury Norov <yury.norov@gmail.com>
>Sent: Monday, December 18, 2023 3:02 AM
>To: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>; KY Srinivasan
><kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
>wei.liu@kernel.org; Dexuan Cui <decui@microsoft.com>; davem@davemloft.net;
>edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Long Li
><longli@microsoft.com>; yury.norov@gmail.com; leon@kernel.org;
>cai.huoqing@linux.dev; ssengar@linux.microsoft.com; vkuznets@redhat.com;
>tglx@linutronix.de; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org;
>linux-kernel@vger.kernel.org; linux-rdma@vger.kernel.org
>Cc: Souradeep Chakrabarti <schakrabarti@microsoft.com>; Paul Rosswurm
><paulros@microsoft.com>
>Subject: [EXTERNAL] [PATCH 3/3] net: mana: add a function to spread IRQs per
>CPUs
>
>Souradeep found that the driver performs faster if IRQs are spread across CPUs
>according to the following heuristics:
>
>1. No more than one IRQ per CPU, if possible;
>2. NUMA locality is the second priority;
>3. Sibling dislocality is the last priority.
>
>Let's consider this topology:
>
>Node            0               1
>Core        0       1       2       3
>CPU       0   1   2   3   4   5   6   7
>
>The most performant IRQ distribution based on the above topology and heuristics
>may look like this:
>
>IRQ     Nodes   Cores   CPUs
>0       1       0       0-1
>1       1       1       2-3
>2       1       0       0-1
>3       1       1       2-3
>4       2       2       4-5
>5       2       3       6-7
>6       2       2       4-5
>7       2       3       6-7
>
>The irq_setup() routine introduced in this patch leverages the
>for_each_numa_hop_mask() iterator and assigns IRQs to sibling groups as
>described above.
>
>According to [1], for NUMA-aware but sibling-ignorant IRQ distribution based on
>cpumask_local_spread() performance test results look like this:
>
>./ntttcp -r -m 16
>NTTTCP for Linux 1.4.0
>---------------------------------------------------------
>08:05:20 INFO: 17 threads created
>08:05:28 INFO: Network activity progressing...
>08:06:28 INFO: Test run completed.
>08:06:28 INFO: Test cycle finished.
>08:06:28 INFO: ##### Totals: #####
>08:06:28 INFO: test duration :60.00 seconds
>08:06:28 INFO: total bytes :630292053310
>08:06:28 INFO: throughput :84.04Gbps
>08:06:28 INFO: retrans segs :4
>08:06:28 INFO: cpu cores :192
>08:06:28 INFO: cpu speed :3799.725MHz
>08:06:28 INFO: user :0.05%
>08:06:28 INFO: system :1.60%
>08:06:28 INFO: idle :96.41%
>08:06:28 INFO: iowait :0.00%
>08:06:28 INFO: softirq :1.94%
>08:06:28 INFO: cycles/byte :2.50
>08:06:28 INFO: cpu busy (all) :534.41%
>
>For NUMA- and sibling-aware IRQ distribution, the same test works 15% faster:
>
>./ntttcp -r -m 16
>NTTTCP for Linux 1.4.0
>---------------------------------------------------------
>08:08:51 INFO: 17 threads created
>08:08:56 INFO: Network activity progressing...
>08:09:56 INFO: Test run completed.
>08:09:56 INFO: Test cycle finished.
>08:09:56 INFO: ##### Totals: #####
>08:09:56 INFO: test duration :60.00 seconds
>08:09:56 INFO: total bytes :741966608384
>08:09:56 INFO: throughput :98.93Gbps
>08:09:56 INFO: retrans segs :6
>08:09:56 INFO: cpu cores :192
>08:09:56 INFO: cpu speed :3799.791MHz
>08:09:56 INFO: user :0.06%
>08:09:56 INFO: system :1.81%
>08:09:56 INFO: idle :96.18%
>08:09:56 INFO: iowait :0.00%
>08:09:56 INFO: softirq :1.95%
>08:09:56 INFO: cycles/byte :2.25
>08:09:56 INFO: cpu busy (all) :569.22%
>
>[1] https://lore.kernel.org/all/20231211063726.GA4977@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
>
>Signed-off-by: Yury Norov <yury.norov@gmail.com>
>Co-developed-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
>---
> .../net/ethernet/microsoft/mana/gdma_main.c | 28 +++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
>diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
>index 6367de0c2c2e..11e64e42e3b2 100644
>--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
>+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
>@@ -1243,6 +1243,34 @@ void mana_gd_free_res_map(struct gdma_resource *r)
> 	r->size = 0;
> }
>
>+static __maybe_unused int irq_setup(unsigned int *irqs, unsigned int len, int node)
>+{
>+	const struct cpumask *next, *prev = cpu_none_mask;
>+	cpumask_var_t cpus __free(free_cpumask_var);
>+	int cpu, weight;
>+
>+	if (!alloc_cpumask_var(&cpus, GFP_KERNEL))
>+		return -ENOMEM;
>+
>+	rcu_read_lock();
>+	for_each_numa_hop_mask(next, node) {
>+		weight = cpumask_weight_andnot(next, prev);
>+		while (weight-- > 0) {
Make this: while (weight > 0) {
>+			cpumask_andnot(cpus, next, prev);
>+			for_each_cpu(cpu, cpus) {
>+				if (len-- == 0)
>+					goto done;
>+				irq_set_affinity_and_hint(*irqs++, topology_sibling_cpumask(cpu));
>+				cpumask_andnot(cpus, cpus, topology_sibling_cpumask(cpu));
Decrement weight here (--weight). Otherwise this code will traverse the same node
N^2 times, where each node has N CPUs.
>+			}
>+		}
>+		prev = next;
>+	}
>+done:
>+	rcu_read_unlock();
>+	return 0;
>+}
>+
> static int mana_gd_setup_irqs(struct pci_dev *pdev)
> {
> 	unsigned int max_queues_per_port = num_online_cpus();
>--
>2.40.1