From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>, Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
Mike Snitzer <snitzer@kernel.org>,
Jaegeuk Kim <jaegeuk@kernel.org>,
Jianchao Wang <jianchao.w.wang@oracle.com>
Subject: Re: [PATCH v3 2/7] block: Send requeued requests to the I/O scheduler
Date: Thu, 25 May 2023 08:06:31 +0900 [thread overview]
Message-ID: <bf32b0f9-d85b-367f-e6f4-83d58c418d7a@kernel.org> (raw)
In-Reply-To: <a40b10d9-4e30-438f-2509-28bb0df4a161@acm.org>
On 5/25/23 02:56, Bart Van Assche wrote:
> On 5/23/23 17:31, Ming Lei wrote:
>> On Tue, May 23, 2023 at 10:19:34AM -0700, Bart Van Assche wrote:
>>> The mq-deadline scheduler restricts the queue depth to one per zone for zoned
>>> storage so at any time there is at most one write command (REQ_OP_WRITE) in
>>> flight per zone.
>>
>> But if the write queue depth is 1 per zone, the requeued request won't
>> be re-ordered at all given no other write request can be issued from
>> scheduler in this zone before this requeued request is completed.
>>
>> So why bother to requeue the BLK_STS_RESOURCE request via scheduler?
>
> Hi Ming,
>
> It seems like my previous email was not clear enough. The mq-deadline
> scheduler restricts the queue depth per zone for commands passed to the
> SCSI core. It does not restrict how many requests a filesystem can
> submit per zone to the block layer. Without this patch there is a risk
> of reordering if a request is requeued, e.g. by the SCSI core, and other
> requests are pending for the same zone.
Yes there is, but the contract we established for zoned devices in the block
layer, from the start of the support, is that users *must* write sequentially.
The block layer does not attempt, generally speaking, to reorder requests.
When mq-deadline is used, the scheduler lba reordering *may* reorder writes,
thus hiding potential bugs in the user write sequence for a zone. That is fine.
However, once a write request is dispatched, we should keep the assumption that
it is a well formed one, namely directed at the zone write pointer. So any
consideration of requeue solving write ordering issues is moot to me.
Furthermore, when the requeue happens, the target zone is still locked and the
only write request that can be in flight for that target zones is that one being
requeued. Add to that the above assumption that the request is the one we must
dispatch first, there are absolutely zero chances of seeing a reordering happen
for writes to a particular zone. I really do not see the point of requeuing that
request through the IO scheduler at all.
In general, even for reads, requeuing through the scheduler is I think a really
bad idea as that can potentially significantly increase the request latency
(time to completion), with the user seeing long tail latencies. E.g. if the
request has high priority or a short CDL time limit, requeuing through the
scheduler will go against the user indicated urgency for that request and
degrade the effectivness of latency control easures such as IO priority and CDL.
Requeues should be at the head of the dispatch queue, not through the scheduler.
As long as we keep zone write locking for zoned devices, requeue to the head of
the dispatch queue is fine. But maybe this work is preparatory to removing zone
write locking ? If that is the case, I would like to see that as well to get the
big picture. Otherwise, the latency concerns I raised above are in my opinion, a
blocker for this change.
>
> Thanks,
>
> Bart.
>
--
Damien Le Moal
Western Digital Research
next prev parent reply other threads:[~2023-05-24 23:07 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-22 18:38 [PATCH v3 0/7] Submit zoned writes in order Bart Van Assche
2023-05-22 18:38 ` [PATCH v3 1/7] block: Rename a local variable in blk_mq_requeue_work() Bart Van Assche
2023-05-23 7:12 ` Christoph Hellwig
2023-05-23 9:48 ` Johannes Thumshirn
2023-05-22 18:38 ` [PATCH v3 2/7] block: Send requeued requests to the I/O scheduler Bart Van Assche
2023-05-23 7:18 ` Christoph Hellwig
2023-05-23 22:30 ` Bart Van Assche
2023-05-24 6:13 ` Christoph Hellwig
2023-05-24 18:22 ` Bart Van Assche
2023-05-25 8:25 ` Christoph Hellwig
2023-05-23 9:03 ` Ming Lei
2023-05-23 17:19 ` Bart Van Assche
2023-05-24 0:31 ` Ming Lei
2023-05-24 17:56 ` Bart Van Assche
2023-05-24 23:06 ` Damien Le Moal [this message]
2023-05-25 0:53 ` Ming Lei
2023-06-21 0:34 ` Bart Van Assche
2023-06-22 23:45 ` Damien Le Moal
2023-06-23 20:31 ` Bart Van Assche
2023-05-22 18:38 ` [PATCH v3 3/7] block: Requeue requests if a CPU is unplugged Bart Van Assche
2023-05-23 7:19 ` Christoph Hellwig
2023-05-23 8:17 ` Ming Lei
2023-05-23 20:15 ` Bart Van Assche
2023-05-24 0:35 ` Ming Lei
2023-05-24 18:18 ` Bart Van Assche
2023-05-22 18:38 ` [PATCH v3 4/7] block: Make it easier to debug zoned write reordering Bart Van Assche
2023-05-23 7:19 ` Christoph Hellwig
2023-05-23 19:34 ` Bart Van Assche
2023-05-24 6:13 ` Christoph Hellwig
2023-05-24 18:25 ` Bart Van Assche
2023-05-22 18:38 ` [PATCH v3 5/7] block: Preserve the order of requeued requests Bart Van Assche
2023-05-22 18:38 ` [dm-devel] [PATCH v3 6/7] dm: Inline __dm_mq_kick_requeue_list() Bart Van Assche
2023-05-22 18:38 ` Bart Van Assche
2023-05-23 7:22 ` [dm-devel] " Christoph Hellwig
2023-05-23 7:22 ` Christoph Hellwig
2023-05-22 18:38 ` [dm-devel] [PATCH v3 7/7] block: Inline blk_mq_{, delay_}kick_requeue_list() Bart Van Assche
2023-05-22 18:38 ` [PATCH v3 7/7] block: Inline blk_mq_{,delay_}kick_requeue_list() Bart Van Assche
2023-05-24 8:01 ` [dm-devel] [PATCH v3 7/7] block: Inline blk_mq_{, delay_}kick_requeue_list() Vineeth Vijayan
2023-05-24 8:01 ` [PATCH v3 7/7] block: Inline blk_mq_{,delay_}kick_requeue_list() Vineeth Vijayan
2023-05-23 7:22 ` [PATCH v3 0/7] Submit zoned writes in order Christoph Hellwig
2023-05-23 20:04 ` Bart Van Assche
2023-05-24 6:15 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bf32b0f9-d85b-367f-e6f4-83d58c418d7a@kernel.org \
--to=dlemoal@kernel.org \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=hch@lst.de \
--cc=jaegeuk@kernel.org \
--cc=jianchao.w.wang@oracle.com \
--cc=linux-block@vger.kernel.org \
--cc=ming.lei@redhat.com \
--cc=snitzer@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.