Linux block layer
From: Aaron Tomlin <atomlin@atomlin.com>
To: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
	mst@redhat.com
Cc: atomlin@atomlin.com, aacraid@microsemi.com,
	James.Bottomley@HansenPartnership.com,
	martin.petersen@oracle.com, liyihang9@h-partners.com,
	kashyap.desai@broadcom.com, sumit.saxena@broadcom.com,
	shivasharan.srikanteshwara@broadcom.com,
	chandrakanth.patil@broadcom.com, sathya.prakash@broadcom.com,
	sreekanth.reddy@broadcom.com,
	suganath-prabu.subramani@broadcom.com, ranjan.kumar@broadcom.com,
	jinpu.wang@cloud.ionos.com, tglx@kernel.org, mingo@redhat.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, akpm@linux-foundation.org,
	maz@kernel.org, ruanjinjie@huawei.com, bigeasy@linutronix.de,
	yphbchou0911@gmail.com, wagi@kernel.org, frederic@kernel.org,
	longman@redhat.com, chenridong@huawei.com, hare@suse.de,
	kch@nvidia.com, ming.lei@redhat.com, tom.leiming@gmail.com,
	steve@abita.co, sean@ashe.io, chjohnst@gmail.com, neelx@suse.com,
	mproche@gmail.com, nick.lange@gmail.com,
	marco.crivellari@suse.com, rishil1999@outlook.com,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v13 7/8] genirq/affinity: Restrict managed IRQ affinity to housekeeping CPUs
Date: Tue, 12 May 2026 20:55:08 -0400	[thread overview]
Message-ID: <20260513005509.135966-8-atomlin@atomlin.com> (raw)
In-Reply-To: <20260513005509.135966-1-atomlin@atomlin.com>

At present, the managed interrupt spreading algorithm distributes vectors
across all available CPUs within a given node or system. On systems
employing CPU isolation (e.g., "isolcpus=io_queue"), this behaviour
defeats the primary purpose of isolation by routing hardware interrupts
(such as NVMe completion queues) directly to isolated cores.

Update irq_create_affinity_masks() to respect the housekeeping CPU mask.
By passing the HK_TYPE_IO_QUEUE mask directly to the topological
distribution function (group_mask_cpus_evenly()), we ensure that managed
interrupts are kept strictly off isolated CPUs.

Additionally, address the architectural constraints that arise from
restricted vector distribution:

    1. Vector Limits: Updated irq_calc_affinity_vectors() to bound the
       maximum number of allocated vectors to the weight of the
       housekeeping mask. This prevents drivers from wasting memory on
       dead hardware queues that cannot be routed to isolated CPUs.

    2. Multi-set Alignment: When isolation constraints result in fewer
       available masks than requested vectors for a given set, the
       remaining vector slots are padded with irq_default_affinity. The
       loop correctly advances by the requested vector count (this_vecs)
       to prevent shifting and corrupting the 1:1 hardware queue mapping
       for subsequent sets.

    3. Zero Overhead: The housekeeping mask is conditionally assigned via
       a direct pointer, avoiding temporary mask allocations (e.g.,
       alloc_cpumask_var) and bitwise operations entirely when CPU
       isolation is disabled. Standard configurations therefore incur no
       performance or memory overhead.

Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
 kernel/irq/affinity.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 78f2418a8925..1d39dce685c7 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -8,6 +8,7 @@
 #include <linux/slab.h>
 #include <linux/cpu.h>
 #include <linux/group_cpus.h>
+#include <linux/sched/isolation.h>
 
 static void default_calc_sets(struct irq_affinity *affd, unsigned int affvecs)
 {
@@ -25,8 +26,10 @@ static void default_calc_sets(struct irq_affinity *affd, unsigned int affvecs)
 struct irq_affinity_desc *
 irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
 {
-	unsigned int affvecs, curvec, usedvecs, i;
+	unsigned int affvecs, curvec, usedvecs, i, j;
 	struct irq_affinity_desc *masks = NULL;
+	const struct cpumask *hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
+	bool hk_enabled = housekeeping_enabled(HK_TYPE_IO_QUEUE);
 
 	/*
 	 * Determine the number of vectors which need interrupt affinities
@@ -70,19 +73,29 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
 	 */
 	for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) {
 		unsigned int nr_masks, this_vecs = affd->set_size[i];
-		struct cpumask *result = group_cpus_evenly(this_vecs, &nr_masks);
+		struct cpumask *result;
+		const struct cpumask *mask;
 
+		if (hk_enabled)
+			mask = hk_mask;
+		else
+			mask = cpu_possible_mask;
+
+		result = group_mask_cpus_evenly(this_vecs, mask,
+						&nr_masks);
 		if (!result) {
 			kfree(masks);
 			return NULL;
 		}
-
-		for (int j = 0; j < nr_masks; j++)
+		for (j = 0; j < nr_masks; j++)
 			cpumask_copy(&masks[curvec + j].mask, &result[j]);
+		for (j = nr_masks; j < this_vecs; j++)
+			cpumask_copy(&masks[curvec + j].mask, irq_default_affinity);
+
 		kfree(result);
 
-		curvec += nr_masks;
-		usedvecs += nr_masks;
+		curvec += this_vecs;
+		usedvecs += this_vecs;
 	}
 
 	/* Fill out vectors at the end that don't need affinity */
@@ -115,10 +128,14 @@ unsigned int irq_calc_affinity_vectors(unsigned int minvec, unsigned int maxvec,
 	if (resv > minvec)
 		return 0;
 
-	if (affd->calc_sets)
+	if (affd->calc_sets) {
 		set_vecs = maxvec - resv;
-	else
-		set_vecs = cpumask_weight(cpu_possible_mask);
+	} else {
+		if (housekeeping_enabled(HK_TYPE_IO_QUEUE))
+			set_vecs = cpumask_weight(housekeeping_cpumask(HK_TYPE_IO_QUEUE));
+		else
+			set_vecs = cpumask_weight(cpu_possible_mask);
+	}
 
 	return resv + min(set_vecs, maxvec - resv);
 }
-- 
2.51.0


Thread overview: 9+ messages
2026-05-13  0:55 [PATCH v13 0/8] blk: honor isolcpus configuration Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 1/8] scsi: aacraid: use block layer helpers to calculate num of queues Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 2/8] lib/group_cpus: remove dead !SMP code Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 3/8] lib/group_cpus: Add group_mask_cpus_evenly() Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 4/8] isolation: Introduce io_queue isolcpus type Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 5/8] blk-mq: use hk cpus only when isolcpus=io_queue is enabled Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 6/8] blk-mq: prevent offlining hk CPUs with associated online isolated CPUs Aaron Tomlin
2026-05-13  0:55 ` Aaron Tomlin [this message]
2026-05-13  0:55 ` [PATCH v13 8/8] docs: add io_queue flag to isolcpus Aaron Tomlin
