From: Aaron Tomlin <atomlin@atomlin.com>
To: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
	mst@redhat.com
Cc: atomlin@atomlin.com, aacraid@microsemi.com,
	James.Bottomley@HansenPartnership.com,
	martin.petersen@oracle.com, liyihang9@h-partners.com,
	kashyap.desai@broadcom.com, sumit.saxena@broadcom.com,
	shivasharan.srikanteshwara@broadcom.com,
	chandrakanth.patil@broadcom.com, sathya.prakash@broadcom.com,
	sreekanth.reddy@broadcom.com,
	suganath-prabu.subramani@broadcom.com, ranjan.kumar@broadcom.com,
	jinpu.wang@cloud.ionos.com, tglx@kernel.org, mingo@redhat.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, akpm@linux-foundation.org,
	maz@kernel.org, ruanjinjie@huawei.com, bigeasy@linutronix.de,
	yphbchou0911@gmail.com, wagi@kernel.org, frederic@kernel.org,
	longman@redhat.com, chenridong@huawei.com, hare@suse.de,
	kch@nvidia.com, ming.lei@redhat.com, tom.leiming@gmail.com,
	steve@abita.co, sean@ashe.io, chjohnst@gmail.com, neelx@suse.com,
	mproche@gmail.com, nick.lange@gmail.com,
	marco.crivellari@suse.com, rishil1999@outlook.com,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v13 6/8] blk-mq: prevent offlining hk CPUs with associated online isolated CPUs
Date: Tue, 12 May 2026 20:55:07 -0400
Message-ID: <20260513005509.135966-7-atomlin@atomlin.com>
In-Reply-To: <20260513005509.135966-1-atomlin@atomlin.com>

From: Daniel Wagner <wagi@kernel.org>

When isolcpus=io_queue is enabled and the last online housekeeping CPU
mapped to a given hctx goes offline, no housekeeping CPU is left to
handle I/O for the isolated CPUs still mapped to that hctx. To prevent
I/O stalls, disallow offlining a housekeeping CPU while it is the last
one serving online isolated CPUs.

Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
[atomlin:
    - Remove duplicate paragraph from commit message
    - Allow offlining of non-housekeeping CPUs
    - Fix logic flaw that prematurely rejected valid offline requests
    - Iterate over cpu_online_mask and manually reverse-map CPUs to
      correctly detect isolated CPUs, as blk_mq_map_swqueue()
      intentionally prunes them from hctx->cpumask
    - Prevent a TOCTOU NULL pointer dereference race against
      concurrent device teardown by using READ_ONCE() to fetch the disk
      pointer]
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
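For reviewers who want to reason about the offline decision outside the
kernel, here is a minimal userspace sketch of the same check. It is an
illustration under stated assumptions, not the kernel code: the
sketch_topology struct, the sketch_* helpers, the 8-CPU limit and the
bitmask representation are all hypothetical stand-ins for
cpu_online_mask, housekeeping_cpumask(HK_TYPE_IO_QUEUE) and
blk_mq_map_queue_type().

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 8	/* hypothetical limit for the model */

/* Hypothetical stand-in for the kernel state the patch consults. */
struct sketch_topology {
	uint32_t online_mask;			/* bit n set: CPU n is online */
	uint32_t hk_mask;			/* bit n set: CPU n is housekeeping */
	int cpu_to_hctx[SKETCH_NR_CPUS];	/* hctx index each CPU maps to */
};

static bool sketch_test_cpu(uint32_t mask, int cpu)
{
	return (mask >> cpu) & 1u;
}

/*
 * Return true if this_cpu may go offline without leaving online
 * isolated CPUs mapped to hctx with no housekeeping CPU to serve them.
 */
static bool sketch_can_offline_hk_cpu(const struct sketch_topology *t,
				      int hctx, int this_cpu)
{
	bool isolated_left = false;
	int cpu;

	/* Offlining a non-housekeeping CPU cannot strand anyone. */
	if (!sketch_test_cpu(t->hk_mask, this_cpu))
		return true;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (cpu == this_cpu || !sketch_test_cpu(t->online_mask, cpu))
			continue;
		if (t->cpu_to_hctx[cpu] != hctx)
			continue;
		/* Another housekeeping CPU still serves this hctx. */
		if (sketch_test_cpu(t->hk_mask, cpu))
			return true;
		isolated_left = true;
	}

	/* Refuse only if an online isolated CPU would be left alone. */
	return !isolated_left;
}
```

With CPUs 0-3 online, CPUs 0-1 as housekeeping, and CPUs 0 and 2 mapped
to hctx 0, sketch_can_offline_hk_cpu(&t, 0, 0) returns false, because
isolated CPU 2 would be left without a housekeeping CPU. The same call
returns true once CPU 2 is offline or CPU 1 is also mapped to hctx 0.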
 block/blk-mq.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4c5c16cce4f8..afe0c0bf7e8a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3720,6 +3720,57 @@ static bool blk_mq_hctx_has_requests(struct blk_mq_hw_ctx *hctx)
 	return data.has_rq;
 }
 
+static bool blk_mq_hctx_can_offline_hk_cpu(struct blk_mq_hw_ctx *hctx,
+					   unsigned int this_cpu)
+{
+	const struct cpumask *hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
+	struct gendisk *disk;
+	int cpu, fallback_isolated_cpu = -1;
+
+	/*
+	 * If the CPU being offlined is not a housekeeping CPU,
+	 * offlining it will not strand isolated CPUs. Allow it.
+	 */
+	if (!cpumask_test_cpu(this_cpu, hk_mask))
+		return true;
+	/*
+	 * Iterate over all online CPUs and manually check their mapping.
+	 * We cannot use hctx->cpumask here because blk_mq_map_swqueue()
+	 * intentionally strips isolated CPUs from it to prevent kworker
+	 * routing.
+	 */
+	for_each_online_cpu(cpu) {
+		struct blk_mq_hw_ctx *h;
+
+		if (cpu == this_cpu)
+			continue;
+
+		h = blk_mq_map_queue_type(hctx->queue, hctx->type, cpu);
+		if (h != hctx)
+			continue;
+
+		if (cpumask_test_cpu(cpu, hk_mask))
+			return true;
+
+		if (fallback_isolated_cpu == -1)
+			fallback_isolated_cpu = cpu;
+	}
+
+	if (fallback_isolated_cpu != -1) {
+		/*
+		 * Use READ_ONCE() to prevent compiler double-fetch TOCTOU
+		 * issues if the disk is removed concurrently.
+		 */
+		disk = READ_ONCE(hctx->queue->disk);
+		pr_warn("%s: trying to offline hctx%d but online isolated CPU %d is still mapped to it\n",
+			disk ? disk->disk_name : "?", hctx->queue_num,
+			fallback_isolated_cpu);
+		return false;
+	}
+
+	return true;
+}
+
 static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx,
 		unsigned int this_cpu)
 {
@@ -3752,6 +3803,11 @@ static int blk_mq_hctx_notify_offline(unsigned int cpu, struct hlist_node *node)
 			struct blk_mq_hw_ctx, cpuhp_online);
 	int ret = 0;
 
+	if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) {
+		if (!blk_mq_hctx_can_offline_hk_cpu(hctx, cpu))
+			return -EINVAL;
+	}
+
 	if (!hctx->nr_ctx || blk_mq_hctx_has_online_cpu(hctx, cpu))
 		return 0;
 
-- 
2.51.0


Thread overview: 11+ messages
2026-05-13  0:55 [PATCH v13 0/8] blk: honor isolcpus configuration Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 1/8] scsi: aacraid: use block layer helpers to calculate num of queues Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 2/8] lib/group_cpus: remove dead !SMP code Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 3/8] lib/group_cpus: Add group_mask_cpus_evenly() Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 4/8] isolation: Introduce io_queue isolcpus type Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 5/8] blk-mq: use hk cpus only when isolcpus=io_queue is enabled Aaron Tomlin
     [not found]   ` <3af2cd18-1221-4ff6-aa7f-6dab74460eab@nitrogen.local>
2026-05-13 23:30     ` Aaron Tomlin
2026-05-14 10:42       ` Daniel Wagner
2026-05-13  0:55 ` Aaron Tomlin [this message]
2026-05-13  0:55 ` [PATCH v13 7/8] genirq/affinity: Restrict managed IRQ affinity to housekeeping CPUs Aaron Tomlin
2026-05-13  0:55 ` [PATCH v13 8/8] docs: add io_queue flag to isolcpus Aaron Tomlin
