public inbox for linux-kernel@vger.kernel.org
From: Chen Yu <yu.c.chen@intel.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ming Lei <ming.lei@redhat.com>,
	"Rei Yamamoto" <yamamoto.rei@jp.fujitsu.com>
Cc: <linux-kernel@vger.kernel.org>,
	Sasha Neftin <sasha.neftin@intel.com>,
	"Lifshits, Vitaly" <vitaly.lifshits@intel.com>,
	"Ruinskiy, Dima" <dima.ruinskiy@intel.com>
Subject: Managed interrupt spread based on Numa locality
Date: Fri, 2 Feb 2024 14:55:33 +0800	[thread overview]
Message-ID: <ZbyR5RCFWDORmkBk@chenyu5-mobl2> (raw)

Dear experts,
 
Recently we have been evaluating some multi-queue NIC device drivers,
with the goal of switching them from conventional interrupts to
managed interrupts.

With managed interrupts, an interrupt does not have to be migrated
during CPU offline; it is simply shut down when the last CPU in its
affinity mask goes offline.

This saves vector space and helps hibernation take every nonboot
CPU offline. Otherwise, an error like the following can occur:
[48175.409994] CPU 239 has 165 vectors, 36 available. Cannot disable CPU

However, after switching to managed interrupts, a question arises
about how the interrupts are spread among NUMA nodes.

If device d1 is attached to node n1, can d1's managed interrupts
be allocated on the CPUs of n1 first? The driver of d1 could then
allocate its buffers on n1, and with d1's managed interrupts
delivered to CPUs on n1, the DMA->DRAM->CPU->net_rx_action() path
would be NUMA friendly.

Question:
Does it make sense to make the interrupt spreading aware of NUMA
locality, or is there an existing mechanism for this? The driver could
provide the preferred NUMA node in struct irq_affinity->node, pass it
to the managed interrupt spreading logic, and the interrupts would then
be spread within that node.

Thanks in advance.

diff --git a/lib/group_cpus.c b/lib/group_cpus.c
index aa3f6815bb12..836e9d374c19 100644
--- a/lib/group_cpus.c
+++ b/lib/group_cpus.c
@@ -344,7 +344,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps,
  * We guarantee in the resulted grouping that all CPUs are covered, and
  * no same CPU is assigned to multiple groups
  */
-struct cpumask *group_cpus_evenly(unsigned int numgrps)
+struct cpumask *group_cpus_evenly(unsigned int numgrps, int node)
 {
 	unsigned int curgrp = 0, nr_present = 0, nr_others = 0;
 	cpumask_var_t *node_to_cpumask;
@@ -370,9 +370,14 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps)
 	cpus_read_lock();
 	build_node_to_cpumask(node_to_cpumask);
 
+	if (node != NUMA_NO_NODE)
+		cpumask_and(npresmsk, cpu_present_mask, node_to_cpumask[node]);
+	else
+		cpumask_copy(npresmsk, cpu_present_mask);
+
 	/* grouping present CPUs first */
 	ret = __group_cpus_evenly(curgrp, numgrps, node_to_cpumask,
-				  cpu_present_mask, nmsk, masks);
+				  npresmsk, nmsk, masks);
 	if (ret < 0)
 		goto fail_build_affinity;
 	nr_present = ret;
-- 
2.25.1
