public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: Hannes Reinecke <hare@suse.de>
To: Daniel Wagner <wagi@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Keith Busch <kbusch@kernel.org>, Christoph Hellwig <hch@lst.de>,
	Sagi Grimberg <sagi@grimberg.me>,
	"Michael S. Tsirkin" <mst@redhat.com>
Cc: Aaron Tomlin <atomlin@atomlin.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Costa Shulyupin <costa.shul@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Waiman Long <llong@redhat.com>, Ming Lei <ming.lei@redhat.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Mel Gorman <mgorman@suse.de>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com,
	linux-scsi@vger.kernel.org, storagedev@microchip.com,
	virtualization@lists.linux.dev,
	GR-QLogic-Storage-Upstream@marvell.com
Subject: Re: [PATCH v8 10/12] blk-mq: use hk cpus only when isolcpus=io_queue is enabled
Date: Mon, 8 Sep 2025 08:13:31 +0200	[thread overview]
Message-ID: <ff66801c-f261-411d-bbbf-b386e013d096@suse.de> (raw)
In-Reply-To: <20250905-isolcpus-io-queues-v8-10-885984c5daca@kernel.org>

On 9/5/25 16:59, Daniel Wagner wrote:
> Extend the capabilities of the generic CPU to hardware queue (hctx)
> mapping code, so it maps houskeeping CPUs and isolated CPUs to the
> hardware queues evenly.
> 
> A hctx is only operational when there is at least one online
> housekeeping CPU assigned (aka active_hctx). Thus, check the final
> mapping that there is no hctx which has only offline housekeeing CPU and
> online isolated CPUs.
> 
> Example mapping result:
> 
>    16 online CPUs
> 
>    isolcpus=io_queue,2-3,6-7,12-13
> 
> Queue mapping:
>          hctx0: default 0 2
>          hctx1: default 1 3
>          hctx2: default 4 6
>          hctx3: default 5 7
>          hctx4: default 8 12
>          hctx5: default 9 13
>          hctx6: default 10
>          hctx7: default 11
>          hctx8: default 14
>          hctx9: default 15
> 
> IRQ mapping:
>          irq 42 affinity 0 effective 0  nvme0q0
>          irq 43 affinity 0 effective 0  nvme0q1
>          irq 44 affinity 1 effective 1  nvme0q2
>          irq 45 affinity 4 effective 4  nvme0q3
>          irq 46 affinity 5 effective 5  nvme0q4
>          irq 47 affinity 8 effective 8  nvme0q5
>          irq 48 affinity 9 effective 9  nvme0q6
>          irq 49 affinity 10 effective 10  nvme0q7
>          irq 50 affinity 11 effective 11  nvme0q8
>          irq 51 affinity 14 effective 14  nvme0q9
>          irq 52 affinity 15 effective 15  nvme0q10
> 
> A corner case is when the number of online CPUs and present CPUs
> differ and the driver asks for less queues than online CPUs, e.g.
> 
>    8 online CPUs, 16 possible CPUs
> 
>    isolcpus=io_queue,2-3,6-7,12-13
>    virtio_blk.num_request_queues=2
> 
> Queue mapping:
>          hctx0: default 0 1 2 3 4 5 6 7 8 12 13
>          hctx1: default 9 10 11 14 15
> 
> IRQ mapping
>          irq 27 affinity 0 effective 0 virtio0-config
>          irq 28 affinity 0-1,4-5,8 effective 5 virtio0-req.0
>          irq 29 affinity 9-11,14-15 effective 0 virtio0-req.1
> 
> Noteworthy is that for the normal/default configuration (!isoclpus) the
> mapping will change for systems which have non hyperthreading CPUs. The
> main assignment loop will completely rely that group_mask_cpus_evenly to
> do the right thing. The old code would distribute the CPUs linearly over
> the hardware context:
> 
> queue mapping for /dev/nvme0n1
>          hctx0: default 0 8
>          hctx1: default 1 9
>          hctx2: default 2 10
>          hctx3: default 3 11
>          hctx4: default 4 12
>          hctx5: default 5 13
>          hctx6: default 6 14
>          hctx7: default 7 15
> 
> The assign each hardware context the map generated by the
> group_mask_cpus_evenly function:
> 
> queue mapping for /dev/nvme0n1
>          hctx0: default 0 1
>          hctx1: default 2 3
>          hctx2: default 4 5
>          hctx3: default 6 7
>          hctx4: default 8 9
>          hctx5: default 10 11
>          hctx6: default 12 13
>          hctx7: default 14 15
> 
> In case of hyperthreading CPUs, the resulting map stays the same.
> 
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
> ---
>   block/blk-mq-cpumap.c | 177 ++++++++++++++++++++++++++++++++++++++++++++------
>   1 file changed, 158 insertions(+), 19 deletions(-)
> 
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 8244ecf878358c0b8de84458dcd5100c2f360213..1e66882e4d5bd9f78d132f3a229a1577853f7a9f 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -17,12 +17,25 @@
>   #include "blk.h"
>   #include "blk-mq.h"
>   
> +static struct cpumask blk_hk_online_mask;
> +
>   static unsigned int blk_mq_num_queues(const struct cpumask *mask,
>   				      unsigned int max_queues)
>   {
>   	unsigned int num;
>   
> -	num = cpumask_weight(mask);
> +	if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) {
> +		const struct cpumask *hk_mask;
> +		struct cpumask avail_mask;
> +
> +		hk_mask = housekeeping_cpumask(HK_TYPE_IO_QUEUE);
> +		cpumask_and(&avail_mask, mask, hk_mask);
> +
> +		num = cpumask_weight(&avail_mask);
> +	} else {
> +		num = cpumask_weight(mask);
> +	}
> +
>   	return min_not_zero(num, max_queues);
>   }
>   
> @@ -31,9 +44,13 @@ static unsigned int blk_mq_num_queues(const struct cpumask *mask,
>    *
>    * Returns an affinity mask that represents the queue-to-CPU mapping
>    * requested by the block layer based on possible CPUs.
> + * This helper takes isolcpus settings into account.
>    */
>   const struct cpumask *blk_mq_possible_queue_affinity(void)
>   {
> +	if (housekeeping_enabled(HK_TYPE_IO_QUEUE))
> +		return housekeeping_cpumask(HK_TYPE_IO_QUEUE);
> +
>   	return cpu_possible_mask;
>   }
>   EXPORT_SYMBOL_GPL(blk_mq_possible_queue_affinity);
> @@ -46,6 +63,12 @@ EXPORT_SYMBOL_GPL(blk_mq_possible_queue_affinity);
>    */
>   const struct cpumask *blk_mq_online_queue_affinity(void)
>   {
> +	if (housekeeping_enabled(HK_TYPE_IO_QUEUE)) {
> +		cpumask_and(&blk_hk_online_mask, cpu_online_mask,
> +			    housekeeping_cpumask(HK_TYPE_IO_QUEUE));
> +		return &blk_hk_online_mask;

Can you explain the use of 'blk_hk_online_mask'?
Why is a static variable?
To my untrained eye it's being recalculated every time one calls
this function. And only the first invocation run on an empty mask,
all subsequent ones see a populated mask.

Is that the intention?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

  reply	other threads:[~2025-09-08  6:13 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-05 14:59 [PATCH v8 00/12] blk: honor isolcpus configuration Daniel Wagner
2025-09-05 14:59 ` [PATCH v8 01/12] scsi: aacraid: use block layer helpers to calculate num of queues Daniel Wagner
2025-09-08  6:06   ` Hannes Reinecke
2025-09-05 14:59 ` [PATCH v8 02/12] lib/group_cpus: remove dead !SMP code Daniel Wagner
2025-09-08  6:06   ` Hannes Reinecke
2025-09-05 14:59 ` [PATCH v8 03/12] lib/group_cpus: Add group_mask_cpus_evenly() Daniel Wagner
2025-09-05 14:59 ` [PATCH v8 04/12] genirq/affinity: Add cpumask to struct irq_affinity Daniel Wagner
2025-09-10  8:22   ` Thomas Gleixner
2025-09-05 14:59 ` [PATCH v8 05/12] blk-mq: add blk_mq_{online|possible}_queue_affinity Daniel Wagner
2025-09-05 14:59 ` [PATCH v8 06/12] nvme-pci: use block layer helpers to constrain queue affinity Daniel Wagner
2025-09-05 14:59 ` [PATCH v8 07/12] scsi: Use " Daniel Wagner
2025-09-08  6:08   ` Hannes Reinecke
2025-09-05 14:59 ` [PATCH v8 08/12] virtio: blk/scsi: use " Daniel Wagner
2025-09-05 14:59 ` [PATCH v8 09/12] isolation: Introduce io_queue isolcpus type Daniel Wagner
2025-09-05 14:59 ` [PATCH v8 10/12] blk-mq: use hk cpus only when isolcpus=io_queue is enabled Daniel Wagner
2025-09-08  6:13   ` Hannes Reinecke [this message]
2025-09-08  7:26     ` Daniel Wagner
2025-09-08  7:51       ` Hannes Reinecke
2025-09-08  8:08         ` Daniel Wagner
2025-09-10  8:20       ` Thomas Gleixner
2025-09-12  8:32         ` Daniel Wagner
2025-09-12 14:31           ` Thomas Gleixner
2025-09-08  7:36   ` Daniel Wagner
2025-09-08 13:05     ` Daniel Wagner
2025-09-10  6:05   ` kernel test robot
2025-09-05 14:59 ` [PATCH v8 11/12] blk-mq: prevent offlining hk CPUs with associated online isolated CPUs Daniel Wagner
2025-09-05 14:59 ` [PATCH v8 12/12] docs: add io_queue flag to isolcpus Daniel Wagner
2026-03-25 17:56 ` [PATCH v8 00/12] blk: honor isolcpus configuration Sebastian Andrzej Siewior
2026-03-26  7:42   ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ff66801c-f261-411d-bbbf-b386e013d096@suse.de \
    --to=hare@suse.de \
    --cc=GR-QLogic-Storage-Upstream@marvell.com \
    --cc=atomlin@atomlin.com \
    --cc=axboe@kernel.dk \
    --cc=costa.shul@redhat.com \
    --cc=frederic@kernel.org \
    --cc=hch@lst.de \
    --cc=juri.lelli@redhat.com \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=llong@redhat.com \
    --cc=martin.petersen@oracle.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=megaraidlinux.pdl@broadcom.com \
    --cc=mgorman@suse.de \
    --cc=ming.lei@redhat.com \
    --cc=mst@redhat.com \
    --cc=sagi@grimberg.me \
    --cc=storagedev@microchip.com \
    --cc=tglx@linutronix.de \
    --cc=virtualization@lists.linux.dev \
    --cc=vschneid@redhat.com \
    --cc=wagi@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox