Linux block layer
From: Jens Axboe <axboe@kernel.dk>
To: Ming Lei <ming.lei@redhat.com>
Cc: linux-block@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	Bart Van Assche <bart.vanassche@sandisk.com>,
	Oleksandr Natalenko <oleksandr@natalenko.name>
Subject: Re: [PATCH 1/2] blk-mq: add requests in the tail of hctx->dispatch
Date: Wed, 30 Aug 2017 09:51:31 -0600
Message-ID: <b2058354-f466-b1d4-1a55-6233ddd0f3ac@kernel.dk>
In-Reply-To: <20170830153929.GB14684@ming.t460p>

On 08/30/2017 09:39 AM, Ming Lei wrote:
> On Wed, Aug 30, 2017 at 09:22:42AM -0600, Jens Axboe wrote:
>> On 08/30/2017 09:19 AM, Ming Lei wrote:
>>> It is more reasonable to add requests to ->dispatch in FIFO
>>> order rather than LIFO order.
>>>
>>> This also allows a request to be inserted at the front of the
>>> hw queue, which is needed to fix a bug in blk-mq's
>>> implementation of blk_execute_rq().
>>>
>>> Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
>>> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>>  block/blk-mq-sched.c | 2 +-
>>>  block/blk-mq.c       | 2 +-
>>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>>> index 4ab69435708c..8d97df40fc28 100644
>>> --- a/block/blk-mq-sched.c
>>> +++ b/block/blk-mq-sched.c
>>> @@ -272,7 +272,7 @@ static bool blk_mq_sched_bypass_insert(struct blk_mq_hw_ctx *hctx,
>>>  	 * the dispatch list.
>>>  	 */
>>>  	spin_lock(&hctx->lock);
>>> -	list_add(&rq->queuelist, &hctx->dispatch);
>>> +	list_add_tail(&rq->queuelist, &hctx->dispatch);
>>>  	spin_unlock(&hctx->lock);
>>>  	return true;
>>>  }
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 4603b115e234..fed3d0c16266 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -1067,7 +1067,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list)
>>>  		blk_mq_put_driver_tag(rq);
>>>  
>>>  		spin_lock(&hctx->lock);
>>> -		list_splice_init(list, &hctx->dispatch);
>>> +		list_splice_tail_init(list, &hctx->dispatch);
>>>  		spin_unlock(&hctx->lock);
>>
>> I'm not convinced this is safe. There's actually a reason why the
>> request is added to the front and not the back. We do have
>> reorder_tags_to_front() as a safeguard, but I'd much rather get rid of
> 
> reorder_tags_to_front() is for reordering the requests in the current
> list, while this patch is about splicing a list into hctx->dispatch, so
> I can't see why it isn't safe. Could you explain it a bit?

If we can get the ordering right, then down the line we won't need to
have the tags reordering at all. It's an ugly hack that I'd love to see
go away.

>> that than make this change.
>>
>> What's your reasoning here? Your changelog doesn't really explain why
> 
> Firstly, the 2nd patch needs to add one rq (such as RQF_PM) to the
> front of the hw queue, and the simple way is to add it to the front
> of hctx->dispatch. Without this change, the 2nd patch can't work
> at all.
> 
> Secondly this way is still reasonable:
> 
> 	- one rq is added to hctx->dispatch because queue is busy
> 	- another rq is added to hctx->dispatch too because of same reason
>
> so it is reasonable to add the list into hctx->dispatch in FIFO style.

Not disagreeing with the logic. But it also raises the question of why
we don't apply the same treatment when we splice leftovers to the
dispatch list. Currently we front splice that.
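
To make the ordering difference concrete, here is a small user-space
sketch. It is not kernel code: struct toy_rq, add_head(), add_tail(),
splice_head_init(), splice_tail_init() and dispatch_order() are
hypothetical stand-ins that only mimic the semantics of list_add(),
list_add_tail(), list_splice_init() and list_splice_tail_init(). It
assumes two requests are bypass-inserted one at a time, and a leftover
list holding two more is then spliced in.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical request type; only the id matters for this sketch. */
struct toy_rq { int id; struct list_head queuelist; };

static void init_list(struct list_head *h) { h->next = h; h->prev = h; }

static void insert_between(struct list_head *n, struct list_head *prev,
                           struct list_head *next)
{
    next->prev = n;
    n->next = next;
    n->prev = prev;
    prev->next = n;
}

/* Mimics list_add(): new entry goes right after the head (LIFO). */
static void add_head(struct list_head *n, struct list_head *h)
{
    insert_between(n, h, h->next);
}

/* Mimics list_add_tail(): new entry goes right before the head (FIFO). */
static void add_tail(struct list_head *n, struct list_head *h)
{
    insert_between(n, h->prev, h);
}

static void splice_between(struct list_head *list, struct list_head *prev,
                           struct list_head *next)
{
    struct list_head *first = list->next, *last = list->prev;

    if (first == list)
        return; /* nothing to splice */
    first->prev = prev;
    prev->next = first;
    last->next = next;
    next->prev = last;
    init_list(list); /* the source list is left empty */
}

/* Mimics list_splice_init(): whole list lands at the front of head. */
static void splice_head_init(struct list_head *list, struct list_head *head)
{
    splice_between(list, head, head->next);
}

/* Mimics list_splice_tail_init(): whole list lands at the back of head. */
static void splice_tail_init(struct list_head *list, struct list_head *head)
{
    splice_between(list, head->prev, head);
}

/*
 * Requests 1 and 2 are inserted one by one, then a leftover list holding
 * 3 and 4 (in that order) is spliced in. With use_tail == 0 the head
 * variants are used, with use_tail != 0 the tail variants. The resulting
 * order of ids is written to out; the number of ids written is returned.
 */
static int dispatch_order(int use_tail, int *out, int max)
{
    struct list_head dispatch, leftover;
    struct toy_rq rq[4];
    int i, n = 0;

    assert(out != NULL);
    init_list(&dispatch);
    init_list(&leftover);
    for (i = 0; i < 4; i++)
        rq[i].id = i + 1;

    for (i = 0; i < 2; i++) {
        if (use_tail)
            add_tail(&rq[i].queuelist, &dispatch);
        else
            add_head(&rq[i].queuelist, &dispatch);
    }
    add_tail(&rq[2].queuelist, &leftover);
    add_tail(&rq[3].queuelist, &leftover);
    if (use_tail)
        splice_tail_init(&leftover, &dispatch);
    else
        splice_head_init(&leftover, &dispatch);

    for (struct list_head *p = dispatch.next; p != &dispatch && n < max; p = p->next) {
        struct toy_rq *r = (struct toy_rq *)((char *)p - offsetof(struct toy_rq, queuelist));
        out[n++] = r->id;
    }
    return n;
}
```

In this toy model the head variants leave the list as 3, 4, 2, 1, and
the tail variants leave it as 1, 2, 3, 4. It says nothing about tag
allocation or forward progress, which is the part that needs real
testing.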

All I'm saying is that you need to tread very carefully with this, and
run it through some careful testing to ensure that we don't introduce
conditions that now livelock. NVMe is the easy test case; it will
generally work since we never run out of tags. The problematic
test case is usually something like SATA with 31 tags, and especially
SATA with flushes that don't queue. One good test case is the one where
all tags (or almost all) end up consumed by flushes, checking that we
still make forward progress.

-- 
Jens Axboe

Thread overview: 8+ messages
2017-08-30 15:19 [PATCH 0/2] blk-mq: fix I/O hang during system resume Ming Lei
2017-08-30 15:19 ` Ming Lei
2017-08-30 15:19 ` [PATCH 1/2] blk-mq: add requests in the tail of hctx->dispatch Ming Lei
2017-08-30 15:22   ` Jens Axboe
2017-08-30 15:39     ` Ming Lei
2017-08-30 15:51       ` Jens Axboe [this message]
2017-08-30 16:58         ` Ming Lei
2017-08-30 15:19 ` [PATCH 2/2] blk-mq: align to legacy's implementation of blk_execute_rq Ming Lei
