Re: [PATCH v8 10/12] blk-mq: use hk cpus only when isolcpus=io_queue is enabled


From: Thomas Gleixner <tglx@linutronix.de>
To: Daniel Wagner <dwagner@suse.de>
Cc: Hannes Reinecke <hare@suse.de>, Daniel Wagner <wagi@kernel.org>,
	Jens Axboe <axboe@kernel.dk>, Keith Busch <kbusch@kernel.org>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Aaron Tomlin <atomlin@atomlin.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Costa Shulyupin <costa.shul@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Waiman Long <llong@redhat.com>, Ming Lei <ming.lei@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Mel Gorman <mgorman@suse.de>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com,
	linux-scsi@vger.kernel.org, storagedev@microchip.com,
	virtualization@lists.linux.dev,
	GR-QLogic-Storage-Upstream@marvell.com
Subject: Re: [PATCH v8 10/12] blk-mq: use hk cpus only when isolcpus=io_queue is enabled
Date: Fri, 12 Sep 2025 16:31:55 +0200
Message-ID: <87cy7vrbc4.ffs@tglx>
In-Reply-To: <bc5ebdea-7091-4999-a021-ec2a65573aa0@flourine.local>

On Fri, Sep 12 2025 at 10:32, Daniel Wagner wrote:
> On Wed, Sep 10, 2025 at 10:20:26AM +0200, Thomas Gleixner wrote:
>> > The cpu_online_mask might change over time, it's not a static bitmap.
>> > Thus it's necessary to update the blk_hk_online_mask. Doing some sort of
>> > caching is certainly possible. Given that we have plenty of cpumask
>> > logic operation in the cpu_group_evenly code path later, I am not so
>> > sure this really makes a huge difference.
>> 
>> Sure, but none of this is serialized against CPU hotplug operations. So
>> the resulting mask, which is handed into the spreading code can be
>> concurrently modified. IOW it's not as const as the code claims.
>
> Thanks for explaining.
>
> In group_cpu_evenly:
>
> 	/*
> 	 * Make a local cache of 'cpu_present_mask', so the two stages
> 	 * spread can observe consistent 'cpu_present_mask' without holding
> 	 * cpu hotplug lock, then we can reduce deadlock risk with cpu
> 	 * hotplug code.
> 	 *
> 	 * Here CPU hotplug may happen when reading `cpu_present_mask`, and
> 	 * we can live with the case because it only affects that hotplug
> 	 * CPU is handled in the 1st or 2nd stage, and either way is correct
> 	 * from API user viewpoint since 2-stage spread is sort of
> 	 * optimization.
> 	 */
> 	cpumask_copy(npresmsk, data_race(cpu_present_mask));

The present mask is very different from the online mask. The present
mask only changes on physical hotplug when:

     - an offline CPU is removed from the present set of CPUs

     - an offline CPU is added to it.

In neither case can the CPU be involved in any operation related to the
actual offline/online operations.

Also contrary to your approach, this code takes the possibility of
a concurrently changing mask into account by taking a racy snapshot,
which is immutable for the following operation.

What you are doing with that static mask makes it a target of
concurrent modification, which is obviously a recipe for subtle bugs.
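
To illustrate the difference, here is a minimal user-space sketch of the
snapshot pattern. It is not kernel code and not what the patch does: the
64-bit `shared_online_mask`, the fixed `hk_mask` and both function names
are hypothetical stand-ins, and C11 atomics take the place of
data_race()/cpumask_copy(). The point is only that the shared mask is
read once and every later step works on the local copy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a mask like cpu_online_mask: one bit per
 * CPU, possibly modified by another thread at any time.
 */
static _Atomic uint64_t shared_online_mask;

/* Hypothetical housekeeping mask, assumed fixed in this sketch. */
static const uint64_t hk_mask = 0x0f;

/*
 * Distribute the CPUs set in @snap round-robin over @ngroups groups.
 * @snap is passed by value, so nothing can change it while we iterate.
 */
static void spread_snapshot(uint64_t snap, uint64_t *groups,
			    unsigned int ngroups)
{
	unsigned int cpu, g;

	assert(ngroups > 0);

	for (g = 0; g < ngroups; g++)
		groups[g] = 0;

	g = 0;
	for (cpu = 0; cpu < 64; cpu++) {
		if (!(snap & (UINT64_C(1) << cpu)))
			continue;
		groups[g] |= UINT64_C(1) << cpu;
		g = (g + 1) % ngroups;
	}
}

/*
 * Take one racy snapshot of the shared mask, restrict it to the
 * housekeeping CPUs, and spread only that local copy. Returns the
 * snapshot that was used.
 */
static uint64_t spread_hk_online(uint64_t *groups, unsigned int ngroups)
{
	uint64_t snap;

	/* Single read of the shared state; no later step touches it. */
	snap = atomic_load_explicit(&shared_online_mask,
				    memory_order_relaxed) & hk_mask;

	spread_snapshot(snap, groups, ngroups);
	return snap;
}
```

A later change to `shared_online_mask` does not alter the groups that
were already computed, which is the property a static mask updated in
place would not have.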

>   Turns out the two stage spread just needs consistent 'cpu_present_mask',
>   and remove the CPU hotplug lock by storing it into one local cache.  This
>   way doesn't change correctness, because all CPUs are still covered.
>
> This sounds like I should do something similar with cpu_online_mask.

Indeed.

Thanks,

        tglx

