From: Aaron Tomlin <atomlin@atomlin.com>
To: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
	mst@redhat.com
Cc: atomlin@atomlin.com, aacraid@microsemi.com,
	James.Bottomley@HansenPartnership.com,
	martin.petersen@oracle.com, liyihang9@h-partners.com,
	kashyap.desai@broadcom.com, sumit.saxena@broadcom.com,
	shivasharan.srikanteshwara@broadcom.com,
	chandrakanth.patil@broadcom.com, sathya.prakash@broadcom.com,
	sreekanth.reddy@broadcom.com,
	suganath-prabu.subramani@broadcom.com, ranjan.kumar@broadcom.com,
	jinpu.wang@cloud.ionos.com, tglx@kernel.org, mingo@redhat.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, akpm@linux-foundation.org,
	maz@kernel.org, ruanjinjie@huawei.com, bigeasy@linutronix.de,
	yphbchou0911@gmail.com, wagi@kernel.org, frederic@kernel.org,
	longman@redhat.com, chenridong@huawei.com, hare@suse.de,
	kch@nvidia.com, ming.lei@redhat.com, tom.leiming@gmail.com,
	steve@abita.co, sean@ashe.io, chjohnst@gmail.com, neelx@suse.com,
	mproche@gmail.com, nick.lange@gmail.com,
	marco.crivellari@suse.com, rishil1999@outlook.com,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v15 7/8] genirq/affinity: Restrict managed IRQ affinity to housekeeping CPUs
Date: Thu, 21 May 2026 19:29:55 -0400
Message-ID: <20260521232956.553287-8-atomlin@atomlin.com>
In-Reply-To: <20260521232956.553287-1-atomlin@atomlin.com>

At present, the managed interrupt spreading algorithm distributes vectors
across all possible CPUs within a given node or system. On systems
employing CPU isolation (e.g., "isolcpus=io_queue"), this behaviour
defeats the primary purpose of isolation by routing hardware interrupts
(such as NVMe completion queue interrupts) directly to isolated cores.

Update irq_create_affinity_masks() to respect the housekeeping CPU mask.
When HK_TYPE_IO_QUEUE isolation is enabled, pass the housekeeping mask
directly to the topological distribution function
(group_mask_cpus_evenly()) so that managed interrupts are kept off
isolated CPUs; otherwise pass cpu_possible_mask, as before.

This patch additionally addresses the constraints that restricted
vector distribution introduces:

  1. Vector Limits and Overrides: Update irq_calc_affinity_vectors()
     to bound the number of allocated vectors by the weight of the
     housekeeping mask (subject to the minvec floor in point 3). The
     bound also applies to drivers providing a calc_sets() callback,
     which previously received maxvec - resv, preventing them from
     wasting memory on dead hardware queues whose interrupts cannot
     be routed to isolated CPUs.

  2. Multi-set Alignment and Leak Prevention: When isolation
     constraints result in fewer available masks than requested
     vectors for a given set, pad the remaining vector slots of that
     set with the housekeeping mask and advance curvec and usedvecs
     by the full set size. This replaces the historical
     irq_default_affinity padding, ensuring excess managed queues do
     not leak interrupts onto isolated CPUs.

  3. Minimum Vector Safety Net: To prevent fatal -ENOSPC device probe
     aborts on heavily isolated systems (where the housekeeping CPU
     count might be lower than a device's structural minimum), never
     let the final vector count drop below minvec. Queues then share
     the available housekeeping CPUs instead of failing the probe.

  4. Zero Overhead: Select the housekeeping mask through a const
     pointer, avoiding temporary mask allocations (e.g.,
     alloc_cpumask_var()) and bitwise mask operations. When CPU
     isolation is disabled, cpu_possible_mask is passed instead, so
     standard configurations incur no additional memory allocation.

Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
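For reviewers, the vector-count rules from points 1 and 3 can be
modelled in userspace. This is only a sketch, not the kernel code:
calc_vectors(), min_u() and max_u() are hypothetical names, and the
hk_weight and possible_weight parameters stand in for the
cpumask_weight() calls on the housekeeping and possible masks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's min()/max() macros. */
static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Userspace model of the patched irq_calc_affinity_vectors().
 * hk_weight and possible_weight replace the cpumask_weight() calls;
 * has_calc_sets replaces the affd->calc_sets pointer test.
 */
static unsigned int calc_vectors(unsigned int minvec, unsigned int maxvec,
				 unsigned int resv, bool hk_enabled,
				 unsigned int hk_weight, bool has_calc_sets,
				 unsigned int possible_weight)
{
	unsigned int set_vecs;

	/* Assumed precondition of this sketch: a sane vector range. */
	assert(minvec <= maxvec);

	if (resv > minvec)
		return 0;

	if (hk_enabled)
		set_vecs = hk_weight;		/* point 1: housekeeping bound */
	else if (has_calc_sets)
		set_vecs = maxvec - resv;
	else
		set_vecs = possible_weight;

	/* point 3: never drop below minvec */
	return max_u(minvec, resv + min_u(set_vecs, maxvec - resv));
}
```

With one reserved vector, maxvec = 33 and four housekeeping CPUs,
calc_vectors(2, 33, 1, true, 4, false, 64) yields 5. With only two
housekeeping CPUs and minvec = 8, the minvec floor applies and the
result is 8, even for a driver that provides calc_sets().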
kernel/irq/affinity.c | 31 +++++++++++++++++++++++--------
1 file changed, 23 insertions(+), 8 deletions(-)
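
The padding behaviour from point 2 can be modelled the same way. This
is a simplified sketch under stated assumptions: toy_cpumask is a
hypothetical 64-bit stand-in for struct cpumask, fill_set() is a
hypothetical helper that mirrors the two copy loops in the hunk
below, and the grouped masks are supplied by the caller rather than
computed by group_mask_cpus_evenly().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, <= 64 CPUs. */
typedef uint64_t toy_cpumask;

/*
 * Copy the nr_masks grouped masks into the set's slots, then pad the
 * remaining (this_vecs - nr_masks) slots with the fallback mask.
 * Returns the new curvec, advanced by the full set size as in the
 * patch, so that the next set starts at the expected slot.
 */
static unsigned int fill_set(toy_cpumask *masks, unsigned int curvec,
			     const toy_cpumask *result,
			     unsigned int nr_masks, unsigned int this_vecs,
			     toy_cpumask mask)
{
	unsigned int j;

	/* Assumption of this sketch: grouping never yields extra masks. */
	assert(nr_masks <= this_vecs);

	for (j = 0; j < nr_masks; j++)
		masks[curvec + j] = result[j];
	for (j = nr_masks; j < this_vecs; j++)
		masks[curvec + j] = mask;

	return curvec + this_vecs;
}
```

For a set of four vectors on a system with housekeeping CPUs 0 and 1,
the first two slots receive the per-CPU masks 0x1 and 0x2, and the
two excess slots receive the whole housekeeping mask 0x3 rather than
irq_default_affinity.
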
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index 78f2418a8925..dade92f8b4b3 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -8,6 +8,7 @@
 #include <linux/slab.h>
 #include <linux/cpu.h>
 #include <linux/group_cpus.h>
+#include <linux/sched/isolation.h>
 
 static void default_calc_sets(struct irq_affinity *affd, unsigned int affvecs)
 {
@@ -25,8 +26,10 @@ static void default_calc_sets(struct irq_affinity *affd, unsigned int affvecs)
 struct irq_affinity_desc *
 irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
 {
-	unsigned int affvecs, curvec, usedvecs, i;
+	unsigned int affvecs, curvec, usedvecs, i, j;
 	struct irq_affinity_desc *masks = NULL;
+	const struct cpumask *hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
+	bool hk_enabled = housekeeping_enabled(HK_TYPE_IO_QUEUE);
 
 	/*
 	 * Determine the number of vectors which need interrupt affinities
@@ -70,19 +73,29 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
 	 */
 	for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) {
 		unsigned int nr_masks, this_vecs = affd->set_size[i];
-		struct cpumask *result = group_cpus_evenly(this_vecs, &nr_masks);
+		struct cpumask *result;
+		const struct cpumask *mask;
 
+		if (hk_enabled)
+			mask = hk_mask;
+		else
+			mask = cpu_possible_mask;
+
+		result = group_mask_cpus_evenly(this_vecs, mask,
+						&nr_masks);
 		if (!result) {
 			kfree(masks);
 			return NULL;
 		}
-
-		for (int j = 0; j < nr_masks; j++)
+		for (j = 0; j < nr_masks; j++)
 			cpumask_copy(&masks[curvec + j].mask, &result[j]);
+		for (j = nr_masks; j < this_vecs; j++)
+			cpumask_copy(&masks[curvec + j].mask, mask);
+
 		kfree(result);
 
-		curvec += nr_masks;
-		usedvecs += nr_masks;
+		curvec += this_vecs;
+		usedvecs += this_vecs;
 	}
 
 	/* Fill out vectors at the end that don't need affinity */
@@ -115,10 +128,12 @@ unsigned int irq_calc_affinity_vectors(unsigned int minvec, unsigned int maxvec,
 	if (resv > minvec)
 		return 0;
 
-	if (affd->calc_sets)
+	if (housekeeping_enabled(HK_TYPE_IO_QUEUE))
+		set_vecs = cpumask_weight(housekeeping_cpumask(HK_TYPE_IO_QUEUE));
+	else if (affd->calc_sets)
 		set_vecs = maxvec - resv;
 	else
 		set_vecs = cpumask_weight(cpu_possible_mask);
 
-	return resv + min(set_vecs, maxvec - resv);
+	return max(minvec, resv + min(set_vecs, maxvec - resv));
 }
--
2.51.0
Thread overview: 11+ messages
2026-05-21 23:29 [PATCH v15 0/8] blk: honor isolcpus configuration Aaron Tomlin
2026-05-21 23:29 ` [PATCH v15 1/8] scsi: aacraid: use block layer helpers to calculate num of queues Aaron Tomlin
2026-05-21 23:29 ` [PATCH v15 2/8] lib/group_cpus: remove dead !SMP code Aaron Tomlin
2026-05-21 23:29 ` [PATCH v15 3/8] lib/group_cpus: Add group_mask_cpus_evenly() Aaron Tomlin
2026-05-21 23:29 ` [PATCH v15 4/8] isolation: Introduce io_queue isolcpus type Aaron Tomlin
2026-05-21 23:29 ` [PATCH v15 5/8] blk-mq: use hk cpus only when isolcpus=io_queue is enabled Aaron Tomlin
2026-05-21 23:29 ` [PATCH v15 6/8] blk-mq: prevent offlining hk CPUs with associated online isolated CPUs Aaron Tomlin
2026-05-21 23:29 ` Aaron Tomlin [this message]
2026-05-21 23:29 ` [PATCH v15 8/8] docs: add io_queue flag to isolcpus Aaron Tomlin
2026-05-26 16:05 ` [PATCH v15 0/8] blk: honor isolcpus configuration Daniel Wagner
2026-05-26 22:02 ` Aaron Tomlin