From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, John Garry <john.garry@huawei.com>,
Hannes Reinecke <hare@suse.com>, Christoph Hellwig <hch@lst.de>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH V10 07/11] blk-mq: stop to handle IO and drain IO before hctx becomes inactive
Date: Sat, 9 May 2020 12:10:42 +0800 [thread overview]
Message-ID: <20200509041042.GG1392681@T590> (raw)
In-Reply-To: <0f578345-5a51-b64a-e150-724cfb18dde4@acm.org>
On Fri, May 08, 2020 at 08:24:44PM -0700, Bart Van Assche wrote:
> On 2020-05-08 19:20, Ming Lei wrote:
> > Not sure why you mention queue freezing.
>
> This patch series introduces a fundamental race between modifying the
> hardware queue state (BLK_MQ_S_INACTIVE) and tag allocation. The only
Basically there are two cases:

1) setting BLK_MQ_S_INACTIVE and driver tag allocation run on the same
CPU; then a compiler barrier is enough. This is what happens most of
the time.

2) setting BLK_MQ_S_INACTIVE and driver tag allocation run on different
CPUs; then one pair of smp_mb() is applied to avoid reordering. This
only happens when the direct-issue process migrates to another CPU.

Please take a look at the comment in this patch:
+ /*
+ * In case the direct-issue IO process has been migrated to another
+ * CPU, which may not belong to this hctx, add one memory barrier so we
+ * can order driver tag assignment and the check of BLK_MQ_S_INACTIVE.
+ * Otherwise, barrier() is enough, since both setting BLK_MQ_S_INACTIVE
+ * and driver tag assignment run on the same CPU: BLK_MQ_S_INACTIVE is
+ * only set after the last CPU of this hctx has gone offline.
+ *
+ * Process migration might happen after the check of the current
+ * processor id; smp_mb() is implied by processor migration, so no need
+ * to worry about that case.
+ */
And you may find more discussion about this topic in the following thread:
https://lore.kernel.org/linux-block/20200429134327.GC700644@T590/
> mechanism I know of for enforcing the order in which another thread
> observes writes to different memory locations without inserting a memory
> barrier in the hot path is RCU (see also The RCU-barrier menagerie;
> https://lwn.net/Articles/573497/). The only existing such mechanism in
> the blk-mq core I know of is queue freezing. Hence my comment about
> queue freezing.
You didn't explain how queue freezing could be used to solve this issue.
We are talking about CPU hotplug vs. IO. In short, when one hctx becomes
inactive (all CPUs in hctx->cpumask have gone offline), in-flight IO
from this hctx needs to be drained to avoid IO timeouts. Also, all
requests in the scheduler/sw queues of this hctx need to be handled
correctly to avoid an IO hang.

Queue freezing can only be applied at the request-queue level, not the
hctx level. And when requests can't be completed, waiting for the freeze
just hangs forever.
Thanks,
Ming
Thread overview: 41+ messages
2020-05-05 2:09 [PATCH V10 00/11] blk-mq: improvement CPU hotplug Ming Lei
2020-05-05 2:09 ` [PATCH V10 01/11] block: clone nr_integrity_segments and write_hint in blk_rq_prep_clone Ming Lei
2020-05-05 2:09 ` [PATCH V10 02/11] block: add helper for copying request Ming Lei
2020-05-05 2:09 ` [PATCH V10 03/11] blk-mq: mark blk_mq_get_driver_tag as static Ming Lei
2020-05-05 2:09 ` [PATCH V10 04/11] blk-mq: assign rq->tag in blk_mq_get_driver_tag Ming Lei
2020-05-05 2:09 ` [PATCH V10 05/11] blk-mq: support rq filter callback when iterating rqs Ming Lei
2020-05-08 23:32 ` Bart Van Assche
2020-05-09 0:18 ` Bart Van Assche
2020-05-09 2:05 ` Ming Lei
2020-05-09 3:08 ` Bart Van Assche
2020-05-09 3:52 ` Ming Lei
2020-05-05 2:09 ` [PATCH V10 06/11] blk-mq: prepare for draining IO when hctx's all CPUs are offline Ming Lei
2020-05-05 6:14 ` Hannes Reinecke
2020-05-08 23:26 ` Bart Van Assche
2020-05-09 2:09 ` Ming Lei
2020-05-09 3:11 ` Bart Van Assche
2020-05-09 3:56 ` Ming Lei
2020-05-05 2:09 ` [PATCH V10 07/11] blk-mq: stop to handle IO and drain IO before hctx becomes inactive Ming Lei
2020-05-08 23:39 ` Bart Van Assche
2020-05-09 2:20 ` Ming Lei
2020-05-09 3:24 ` Bart Van Assche
2020-05-09 4:10 ` Ming Lei [this message]
2020-05-09 14:18 ` Bart Van Assche
2020-05-11 1:45 ` Ming Lei
2020-05-11 3:20 ` Bart Van Assche
2020-05-11 3:48 ` Ming Lei
2020-05-11 20:56 ` Bart Van Assche
2020-05-12 1:25 ` Ming Lei
2020-05-05 2:09 ` [PATCH V10 08/11] block: add blk_end_flush_machinery Ming Lei
2020-05-05 2:09 ` [PATCH V10 09/11] blk-mq: add blk_mq_hctx_handle_dead_cpu for handling cpu dead Ming Lei
2020-05-05 2:09 ` [PATCH V10 10/11] blk-mq: re-submit IO in case that hctx is inactive Ming Lei
2020-05-05 2:09 ` [PATCH V10 11/11] block: deactivate hctx when the hctx is actually inactive Ming Lei
2020-05-09 14:07 ` Bart Van Assche
2020-05-11 2:11 ` Ming Lei
2020-05-11 3:30 ` Bart Van Assche
2020-05-11 4:08 ` Ming Lei
2020-05-11 20:52 ` Bart Van Assche
2020-05-12 1:43 ` Ming Lei
2020-05-12 2:08 ` Ming Lei
2020-05-08 21:49 ` [PATCH V10 00/11] blk-mq: improvement CPU hotplug Ming Lei
2020-05-09 3:17 ` Jens Axboe