From: Ming Lei <tom.leiming@gmail.com>
To: Aaron Tomlin <atomlin@atomlin.com>
Cc: Ming Lei <ming.lei@redhat.com>,
axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
mst@redhat.com, aacraid@microsemi.com,
James.Bottomley@hansenpartnership.com,
martin.petersen@oracle.com, liyihang9@h-partners.com,
kashyap.desai@broadcom.com, sumit.saxena@broadcom.com,
shivasharan.srikanteshwara@broadcom.com,
chandrakanth.patil@broadcom.com, sathya.prakash@broadcom.com,
sreekanth.reddy@broadcom.com,
suganath-prabu.subramani@broadcom.com, ranjan.kumar@broadcom.com,
jinpu.wang@cloud.ionos.com, tglx@kernel.org, mingo@redhat.com,
peterz@infradead.org, juri.lelli@redhat.com,
vincent.guittot@linaro.org, akpm@linux-foundation.org,
maz@kernel.org, ruanjinjie@huawei.com, bigeasy@linutronix.de,
yphbchou0911@gmail.com, wagi@kernel.org, frederic@kernel.org,
longman@redhat.com, chenridong@huawei.com, hare@suse.de,
kch@nvidia.com, steve@abita.co, sean@ashe.io, chjohnst@gmail.com,
neelx@suse.com, mproche@gmail.com, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, virtualization@lists.linux.dev,
linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
megaraidlinux.pdl@broadcom.com, mpi3mr-linuxdrv.pdl@broadcom.com,
MPT-FusionLinux.pdl@broadcom.com
Subject: Re: [PATCH v10 13/13] docs: add io_queue flag to isolcpus
Date: Mon, 13 Apr 2026 23:11:15 +0800
Message-ID: <ad0Hk48y5JEeMlFk@fedora>
In-Reply-To: <6glgsbk2djsz4cqtbp2ht4274dw4rveq6fojlnpnuvx6zmpjxw@i43jo2l4qlz4>
On Sun, Apr 12, 2026 at 06:50:33PM -0400, Aaron Tomlin wrote:
> On Sat, Apr 11, 2026 at 08:52:00PM +0800, Ming Lei wrote:
> > > The critical issue lies at the invocation of group_cpus_evenly(). Without
> > > this patchset, the core logic lacks the necessary constraints to respect
> > > CPU isolation. It is entirely possible, and indeed happens in practice, for
> > > an isolated CPU to be assigned to a CPU mask group.
> >
> > It is one bug report? No, because it doesn't show any trouble from user
> > viewpoint.
>
> Hi Ming,
>
> The lack of a formal bug report does not negate the fact that the current
> behaviour silently breaks the fundamental contract of CPU isolation from
> the administrator's perspective.
>
> To illustrate the user-visible impact, the following demonstrates the
> difference between relying on isolcpus=managed_irq and isolcpus=io_queue
> under 7.0.0-rc3-00065-gd80965e205a5, which includes this series.
>
> The Broadcom MPI3 Storage Controller driver allocates a full complement of
> 48 operational queue pairs. Consequently, a number of MSI-X vectors are
> generated and mapped directly onto the isolated cores thereby breaching
> isolation.
>
> # uname -r
> 7.0.0-rc3-00065-gd80965e205a5
>
> # tr ' ' '\n' < /proc/cmdline | grep isolcpus=
> isolcpus=managed_irq,domain,2-47
>
> # cat /sys/devices/system/cpu/isolated
> 2-47
>
> # dmesg | grep -A 6 'MSI-X vectors supported:'
> [ 2.981705] mpi3mr0: MSI-X vectors supported: 128, no of cores: 48,
> [ 2.981705] mpi3mr0: MSI-X vectors requested: 49 poll_queues 0
> [ 3.001915] mpi3mr0: trying to create 48 operational queue pairs
> [ 3.011214] mpi3mr0: allocating operational queues through segmented queues
> [ 3.101903] mpi3mr0: successfully created 48 operational queue pairs(default/polled) queue = (2/0)
> [ 3.111468] mpi3mr0: controller initialization completed successfully
>
> # awk '/mpi3mr0/ { print $1" "$NF }' /proc/interrupts
> 78: mpi3mr0-msix0
> 79: mpi3mr0-msix1
> 80: mpi3mr0-msix2
> 81: mpi3mr0-msix3
> 82: mpi3mr0-msix4
> 83: mpi3mr0-msix5
> 84: mpi3mr0-msix6
> 85: mpi3mr0-msix7
> 86: mpi3mr0-msix8
> 87: mpi3mr0-msix9
> 88: mpi3mr0-msix10
> 89: mpi3mr0-msix11
> 90: mpi3mr0-msix12
> ...
> 122: mpi3mr0-msix44
> 123: mpi3mr0-msix45
> 124: mpi3mr0-msix46
> 125: mpi3mr0-msix47
> 126: mpi3mr0-msix48
>
> # grep -H '' /proc/irq/{119,120,121,122}/{effective,smp}_affinity_list
> /proc/irq/119/effective_affinity_list:42
> /proc/irq/119/smp_affinity_list:42
> /proc/irq/120/effective_affinity_list:43
> /proc/irq/120/smp_affinity_list:43
> /proc/irq/121/effective_affinity_list:44
> /proc/irq/121/smp_affinity_list:44
> /proc/irq/122/effective_affinity_list:45
> /proc/irq/122/smp_affinity_list:45

But typical applications aren't supposed to submit IO from these isolated
CPUs, so in practice it isn't a big deal.

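
For anyone who wants to repeat the affinity check from the quoted transcripts
without reading the lists by eye, here is a minimal sketch. The helper names
(parse_cpulist, irqs_on_isolated) are hypothetical, the input values are copied
from the transcripts quoted in this mail, and the parser only handles plain
"a-b,c" lists, not the stride forms the kernel cpulist format also accepts.

```python
def parse_cpulist(text):
    """Expand a simple CPU list such as "0-1,4" into a set of ints.

    Assumption: only comma-separated CPUs and a-b ranges are handled.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def irqs_on_isolated(effective, isolated):
    """Return {irq: [cpus]} for IRQs whose effective affinity hits isolated CPUs.

    effective: dict mapping IRQ number -> effective_affinity_list string
    isolated:  string as read from /sys/devices/system/cpu/isolated
    """
    iso = parse_cpulist(isolated)
    hits = {}
    for irq, cpulist in effective.items():
        overlap = parse_cpulist(cpulist) & iso
        if overlap:
            hits[irq] = sorted(overlap)
    return hits


# Values copied from the transcripts quoted in this mail.
managed_irq = {119: "42", 120: "43", 121: "44", 122: "45"}
io_queue = {79: "1", 80: "1", 81: "0"}

print(irqs_on_isolated(managed_irq, "2-47"))
print(irqs_on_isolated(io_queue, "2-47"))
```

On a live system the dictionaries would be filled from
/proc/irq/<N>/effective_affinity_list rather than hard-coded.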
>
>
> Now with isolcpus=io_queue,2-47 the allocation is structurally restricted
> at the source. The driver creates only two operational queues, confining
> all resulting interrupts exclusively to housekeeping CPUs (0 and 1):
>
> # uname -r
> 7.0.0-rc3-00065-gd80965e205a5
>
> # tr ' ' '\n' < /proc/cmdline | grep isolcpus=
> isolcpus=io_queue,domain,2-47
>
> # cat /sys/devices/system/cpu/isolated
> 2-47
>
> # dmesg | grep -A 6 'MSI-X vectors supported:'
> [ 3.284850] mpi3mr0: MSI-X vectors supported: 128, no of cores: 48,
> [ 3.284851] mpi3mr0: MSI-X vectors requested: 49 poll_queues 0
> [ 3.305492] mpi3mr0: allocated vectors (3) are less than configured (49)
> [ 3.316528] mpi3mr0: trying to create 2 operational queue pairs
> [ 3.328013] mpi3mr0: allocating operational queues through segmented queues
> [ 3.340697] mpi3mr0: successfully created 2 operational queue pairs(default/polled) queue = (2/0)
> [ 3.350664] mpi3mr0: controller initialization completed successfully
>
> # awk '/mpi3mr0/ { print $1" "$NF }' /proc/interrupts
> 79: mpi3mr0-msix0
> 80: mpi3mr0-msix1
> 81: mpi3mr0-msix2
>
> # grep -H '' /proc/irq/{79,80,81}/{effective,smp}_affinity_list
> /proc/irq/79/effective_affinity_list:1
> /proc/irq/79/smp_affinity_list:1
> /proc/irq/80/effective_affinity_list:1
> /proc/irq/80/smp_affinity_list:1
> /proc/irq/81/effective_affinity_list:0
> /proc/irq/81/smp_affinity_list:0
>
> > Sebastian explains/shows how "isolcpus=managed_irq" works perfectly in the
> > following link:
> >
> > https://lore.kernel.org/all/20260401110232.ET5RxZfl@linutronix.de/
> >
> > You have reviewed it...
> >
> > What matters is that IO won't interrupt isolated CPU.
>
> The isolcpus=managed_irq acts as a "best effort" avoidance algorithm rather
> than a strict, unbreakable constraint. This is indicated in the proposed
> changes to Documentation/core-api/irq/managed_irq.rst [1].

Yes, it is "best effort", but an isolated CPU is taken as the effective CPU
for the hw queue's irq only if all other CPUs in the queue's affinity mask
are offline. That is just fine for typical use cases, in which IO isn't
submitted from isolated CPUs.
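
The rule can be modelled in a few lines. This is only a simplified sketch of
the behaviour described here, not the kernel code: effective_cpus is a
hypothetical name, and CPU masks are modelled as plain Python sets.

```python
def effective_cpus(affinity, housekeeping, online):
    """Model of the managed_irq "best effort" choice for one managed IRQ.

    Prefer online housekeeping CPUs from the IRQ's affinity mask; fall back
    to the remaining online CPUs of the mask (possibly isolated ones) only
    when no housekeeping CPU in the mask is online.
    """
    preferred = affinity & housekeeping & online
    if preferred:
        return preferred
    return affinity & online


housekeeping = {0, 1}
all_online = set(range(48))

# Mask with one housekeeping and one isolated CPU: the housekeeping CPU wins.
print(effective_cpus({0, 24}, housekeeping, all_online))

# Same mask with CPU 0 offline: the isolated CPU is the only choice left.
print(effective_cpus({0, 24}, housekeeping, all_online - {0}))

# Single-CPU mask on an isolated CPU, as in the quoted 48-queue transcript:
# there is no housekeeping CPU in the mask to prefer.
print(effective_cpus({42}, housekeeping, all_online))
```

The last case corresponds to the quoted output where smp_affinity_list and
effective_affinity_list both show a single isolated CPU.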

Thanks,
Ming