From: Jakub Kicinski <kuba@kernel.org>
To: Shradha Gupta <shradhagupta@linux.microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>,
Shradha Gupta <shradhagupta@microsoft.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>,
Ajay Sharma <sharmaajay@microsoft.com>,
Leon Romanovsky <leon@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
KY Srinivasan <kys@microsoft.com>, Wei Liu <wei.liu@kernel.org>,
Dexuan Cui <decui@microsoft.com>, Long Li <longli@microsoft.com>,
Michael Kelley <mikelley@microsoft.com>
Subject: Re: [PATCH] net :mana : Add per-cpu stats for MANA device
Date: Thu, 14 Mar 2024 11:27:34 -0700 [thread overview]
Message-ID: <20240314112734.5f1c9f7e@kernel.org> (raw)
In-Reply-To: <20240314025720.GA13853@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
On Wed, 13 Mar 2024 19:57:20 -0700 Shradha Gupta wrote:
> Default interrupts affinity for each queue:
>
> 25: 1 103 0 2989138 Hyper-V PCIe MSI 4138200989697-edge mana_q0@pci:7870:00:00.0
> 26: 0 1 4005360 0 Hyper-V PCIe MSI 4138200989698-edge mana_q1@pci:7870:00:00.0
> 27: 0 0 1 2997584 Hyper-V PCIe MSI 4138200989699-edge mana_q2@pci:7870:00:00.0
> 28: 3565461 0 0 1 Hyper-V PCIe MSI 4138200989700-edge mana_q3@pci:7870:00:00.0
>
> As seen, the CPU-to-queue mapping is not 1:1: queue 0 and queue 2 are both
> mapped to CPU 3. From this mapping we can compute the total RX stats processed
> by each CPU, by adding the mana_q0 and mana_q2 values for CPU 3. But if the
> mapping changes dynamically via irqbalance or smp_affinity file edits, that
> assumption breaks.
>
> When the interrupt affinity for mana_q2 changes, the affinity table looks as follows:
> 25: 1 103 0 3038084 Hyper-V PCIe MSI 4138200989697-edge mana_q0@pci:7870:00:00.0
> 26: 0 1 4012447 0 Hyper-V PCIe MSI 4138200989698-edge mana_q1@pci:7870:00:00.0
> 27: 157181 10 1 3007990 Hyper-V PCIe MSI 4138200989699-edge mana_q2@pci:7870:00:00.0
> 28: 3593858 0 0 1 Hyper-V PCIe MSI 4138200989700-edge mana_q3@pci:7870:00:00.0
>
> During this window we may calculate the per-CPU stats incorrectly, skewing
> the picture of per-CPU usage by the MANA driver that monitoring services
> consume.
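The aggregation described in the quoted text can be sketched in userspace. A minimal example (illustrative data taken from the table above; the mapping dict stands in for a parsed /proc/interrupts and ethtool -S snapshot, not a real MANA interface) sums per-queue RX counts into per-CPU totals:

```python
# Sketch only: given which CPU each mana_q IRQ is affinitized to,
# aggregate per-queue RX packet counts into per-CPU totals.
# Both dicts are illustrative snapshots, not live kernel data.
queue_to_cpu = {"mana_q0": 3, "mana_q1": 2, "mana_q2": 3, "mana_q3": 0}
rx_packets = {"mana_q0": 2989138, "mana_q1": 4005360,
              "mana_q2": 2997584, "mana_q3": 3565461}

def per_cpu_rx(queue_to_cpu, rx_packets):
    totals = {}
    for q, cpu in queue_to_cpu.items():
        totals[cpu] = totals.get(cpu, 0) + rx_packets[q]
    return totals

print(per_cpu_rx(queue_to_cpu, rx_packets))
# CPU 3 sums mana_q0 + mana_q2; if irqbalance later moves mana_q2 to
# another CPU, this static mapping silently misattributes its packets.
```

This makes the failure mode concrete: the per-CPU totals are only as fresh as the queue-to-CPU snapshot they were computed from.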
Like Stephen said, forget about irqbalance for networking.
Assume that the IRQs are affinitized and XPS is set correctly.
Now, presumably you can use your pcpu stats to "trade queues":
e.g. with 4 CPUs / 4 queues, if CPU 0 insists on using queue 1
instead of queue 0, you can swap the 0 <-> 1 assignment.
That's just an example of an "algorithm", maybe you have other
use cases. But if the problem is "user runs broken irqbalance"
the solution is not in the kernel...
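The "trade queues" idea above can be sketched as a hypothetical userspace pass: for an N-CPU / N-queue box, look at which CPU actually services each queue and swap assignments accordingly (the function name, data layout, and numbers are illustrative, not a real driver or tool interface):

```python
def trade_queues(usage):
    """usage[q][c] = packets of queue q observed on CPU c.
    Greedily reassign each queue to the CPU that services it most,
    swapping with whichever queue currently holds that CPU."""
    assignment = list(range(len(usage)))  # queue i -> CPU assignment[i]
    for q, counts in enumerate(usage):
        busiest = max(range(len(counts)), key=lambda c: counts[c])
        if assignment[q] != busiest:
            other = assignment.index(busiest)  # queue currently on `busiest`
            assignment[other], assignment[q] = assignment[q], busiest
    return assignment

# CPU 0 insists on servicing queue 1: swap the 0 <-> 1 assignment.
usage = [[10, 900, 0, 0],   # queue 0 mostly runs on CPU 1
         [950, 5, 0, 0],    # queue 1 mostly runs on CPU 0
         [0, 0, 800, 0],    # queues 2 and 3 already match
         [0, 0, 0, 700]]
print(trade_queues(usage))  # -> [1, 0, 2, 3]
```

This is one possible consumer of per-CPU queue stats that works without any assumption about irqbalance behavior, which is the point of the reply.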
Thread overview: 23+ messages
2024-03-07 14:52 [PATCH] net :mana : Add per-cpu stats for MANA device Shradha Gupta
2024-03-07 15:29 ` Jakub Kicinski
2024-03-07 15:49 ` Haiyang Zhang
2024-03-07 17:01 ` Jakub Kicinski
2024-03-08 5:30 ` Shradha Gupta
2024-03-08 18:51 ` Haiyang Zhang
2024-03-08 19:22 ` Jakub Kicinski
2024-03-08 19:43 ` Haiyang Zhang
2024-03-08 19:52 ` Rahul Rameshbabu
2024-03-08 20:27 ` Jakub Kicinski
2024-03-08 20:33 ` Sebastian Andrzej Siewior
2024-03-11 4:19 ` Shradha Gupta
2024-03-11 15:49 ` Stephen Hemminger
2024-03-11 15:51 ` Jakub Kicinski
2024-03-11 16:41 ` Stephen Hemminger
2024-03-14 2:57 ` Shradha Gupta
2024-03-14 3:05 ` Stephen Hemminger
2024-03-14 18:27 ` Jakub Kicinski [this message]
2024-03-14 18:54 ` Haiyang Zhang
2024-03-14 19:05 ` Jakub Kicinski
2024-03-14 20:01 ` [EXTERNAL] " Alireza Dabagh
2024-04-03 5:43 ` Shradha Gupta
2024-03-07 16:17 ` Haiyang Zhang