From: Ming Lei <ming.lei@redhat.com>
To: Daniel Wagner <dwagner@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>, Keith Busch <kbusch@kernel.org>,
Sagi Grimberg <sagi@grimberg.me>,
Thomas Gleixner <tglx@linutronix.de>,
Christoph Hellwig <hch@lst.de>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
John Garry <john.g.garry@oracle.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>,
Kashyap Desai <kashyap.desai@broadcom.com>,
Sumit Saxena <sumit.saxena@broadcom.com>,
Shivasharan S <shivasharan.srikanteshwara@broadcom.com>,
Chandrakanth patil <chandrakanth.patil@broadcom.com>,
Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>,
Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>,
Nilesh Javali <njavali@marvell.com>,
GR-QLogic-Storage-Upstream@marvell.com,
Jonathan Corbet <corbet@lwn.net>,
Frederic Weisbecker <frederic@kernel.org>,
Mel Gorman <mgorman@suse.de>, Hannes Reinecke <hare@suse.de>,
Sridhar Balaraman <sbalaraman@parallelwireless.com>,
"brookxu.cn" <brookxu.cn@gmail.com>,
linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
virtualization@lists.linux.dev, megaraidlinux.pdl@broadcom.com,
mpi3mr-linuxdrv.pdl@broadcom.com,
MPT-FusionLinux.pdl@broadcom.com, storagedev@microchip.com,
linux-doc@vger.kernel.org, ming.lei@redhat.com
Subject: Re: [PATCH v3 15/15] blk-mq: use hk cpus only when isolcpus=io_queue is enabled
Date: Thu, 8 Aug 2024 13:26:41 +0800
Message-ID: <ZrRXEUko5EwKJaaP@fedora>
In-Reply-To: <253ec223-98e1-4e7e-b138-0a83ea1a7b0e@flourine.local>
On Wed, Aug 07, 2024 at 02:40:11PM +0200, Daniel Wagner wrote:
> On Tue, Aug 06, 2024 at 10:55:09PM GMT, Ming Lei wrote:
> > On Tue, Aug 06, 2024 at 02:06:47PM +0200, Daniel Wagner wrote:
> > > When isolcpus=io_queue is enabled all hardware queues should run on the
> > > housekeeping CPUs only. Thus ignore the affinity mask provided by the
> > > driver. Also we can't use blk_mq_map_queues because it will map all CPUs
> > > to first hctx unless, the CPU is the same as the hctx has the affinity
> > > set to, e.g. 8 CPUs with isolcpus=io_queue,2-3,6-7 config
> >
> > What is the expected behavior if someone still tries to submit IO on isolated
> > CPUs?
>
> If a user thread is issuing an IO the IO is handled by the housekeeping
> CPU, which will cause some noise on the submitting CPU. As far as I was
> told this is acceptable. Our customers really don't want to have any
> IO not from their application ever hitting the isolcpus. When their
> application is issuing an IO.
>
> > BTW, I don't see any change in blk_mq_get_ctx()/blk_mq_map_queue() in this
> > patchset,
>
> I was trying to figure out what you tried to explain last time with
> hangs, but didn't really understand what the conditions are for this
> problem to occur.

Isolated CPUs are removed from the queue mapping in this patchset. When someone
submits IO from an isolated CPU, which hctx is the correct one for handling
that IO?

In the current implementation, it depends on the implicitly zero-filled
tag_set->map[type].mq_map[isolated_cpu], so hctx 0 is used.
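
To make the lookup concrete, here is a minimal userspace sketch (not kernel code). The toy_* names and NR_CPUS = 8 are made up for illustration; only the zero-filled per-CPU map is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CPUS 8

/* Hypothetical stand-in for tag_set->map[type].mq_map: CPU -> hctx index. */
struct toy_qmap {
	unsigned int mq_map[NR_CPUS];
};

/*
 * Map only housekeeping CPUs, round-robin over nr_hctx queues.
 * Entries of isolated CPUs are never written and keep their zero fill.
 */
static void toy_map_hk_only(struct toy_qmap *qmap, const bool *isolated,
			    unsigned int nr_hctx)
{
	unsigned int cpu, next = 0;

	assert(nr_hctx > 0);
	memset(qmap->mq_map, 0, sizeof(qmap->mq_map));
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (isolated[cpu])
			continue;	/* isolated CPU: left unmapped */
		qmap->mq_map[cpu] = next++ % nr_hctx;
	}
}

/* Lookup as the submission path would do it: a plain array index. */
static unsigned int toy_cpu_to_hctx(const struct toy_qmap *qmap,
				    unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return qmap->mq_map[cpu];
}
```

With 8 CPUs, CPUs 2-3 and 6-7 isolated, and 4 queues, this sketch maps housekeeping CPUs 0, 1, 4, 5 to hctx 0..3, while a lookup for any isolated CPU returns 0, which is indistinguishable from a deliberate mapping to hctx 0.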

During CPU offline, in blk_mq_hctx_notify_offline(),
blk_mq_hctx_has_online_cpu() returns true even though the last CPU in
hctx 0 is offline, because the isolated CPUs have unexpectedly joined hctx 0,
so IOs in hctx 0 won't be drained.

However, the managed irq core code still shuts down the hw queue's irq because
all CPUs in this hctx are offline now. Then an IO hang is triggered, isn't it?
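
The mismatch between the two views can be modelled with a small userspace sketch. This is a simplified model under my own assumptions, not the kernel implementation: the toy_* names are hypothetical, the block layer check is reduced to "is any other online CPU mapped to this hctx", and the managed irq is reduced to "is any other CPU in its affinity mask still online".

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

struct toy_state {
	unsigned int mq_map[NR_CPUS];	/* CPU -> hctx index, 0 if unmapped */
	bool online[NR_CPUS];		/* CPUs online before the hotplug event */
	bool irq_affinity[NR_CPUS];	/* managed irq mask of the hctx checked */
};

/* Block layer view: does another online CPU still map to this hctx? */
static bool toy_hctx_has_online_cpu(const struct toy_state *s,
				    unsigned int hctx, unsigned int this_cpu)
{
	unsigned int cpu;

	assert(this_cpu < NR_CPUS);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!s->online[cpu] || s->mq_map[cpu] != hctx)
			continue;
		if (cpu != this_cpu)
			return true;
	}
	return false;
}

/* irq view: does the mask keep an online CPU once this_cpu is gone? */
static bool toy_irq_stays_active(const struct toy_state *s,
				 unsigned int this_cpu)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu != this_cpu && s->online[cpu] && s->irq_affinity[cpu])
			return true;
	}
	return false;
}

/* Hang risk: the hctx is not drained, yet its irq gets shut down. */
static bool toy_hang_possible(const struct toy_state *s, unsigned int hctx,
			      unsigned int this_cpu)
{
	return toy_hctx_has_online_cpu(s, hctx, this_cpu) &&
	       !toy_irq_stays_active(s, this_cpu);
}
```

In this model, take hctx 0 with housekeeping CPU 0 in its irq mask and isolated CPUs 2-3, 6-7 falling into hctx 0 through the zero fill. When CPU 0 goes offline, the block layer check still sees CPU 2 and skips draining, while the irq mask has no online CPU left.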

blk-mq currently uses a static, global queue/CPU mapping in which all CPUs
are covered. This patchset removes isolated CPUs from the mapping, which is a
big change from the viewpoint of blk-mq queue mapping.

>
> > that means one random hctx(or even NULL) may be used for submitting
> > IO from isolated CPUs,
> > then there can be io hang risk during cpu hotplug, or
> > kernel panic when submitting bio.
>
> Can you elaborate a bit more? I must miss something important here.
>
> Anyway, my understanding is that when the last CPU of a hctx goes
> offline the affinity is broken and assigned to an online HK CPU. And we
> ensure all in-flight IO has finished and also ensure we don't submit any
> new IO to a CPU which goes offline.
>
> FWIW, I tried really hard to get an IO hang with cpu hotplug.
Please see above.
thanks,
Ming