public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Ming Lei <ming.lei@redhat.com>
Cc: Yi Zhang <yi.zhang@redhat.com>,
	Steffen Maier <maier@linux.ibm.com>,
	linux-block <linux-block@vger.kernel.org>,
	Linux-Next Mailing List <linux-next@vger.kernel.org>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: [bug report] WARNING: CPU: 1 PID: 1386 at block/blk-mq-sched.c:432 blk_mq_sched_insert_request+0x54/0x178
Date: Wed, 3 Nov 2021 09:41:03 -0600	[thread overview]
Message-ID: <30e8f85a-bc43-675f-6594-93c2c60ebd18@kernel.dk> (raw)
In-Reply-To: <YYKnv9VNR7NgdU5p@T590>

On 11/3/21 9:16 AM, Ming Lei wrote:
> On Wed, Nov 03, 2021 at 09:10:17AM -0600, Jens Axboe wrote:
>> On 11/3/21 9:03 AM, Jens Axboe wrote:
>>> On 11/3/21 8:57 AM, Ming Lei wrote:
>>>> On Wed, Nov 03, 2021 at 09:59:02PM +0800, Yi Zhang wrote:
>>>>> On Wed, Nov 3, 2021 at 7:59 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>
>>>>>> On 11/2/21 9:54 PM, Jens Axboe wrote:
>>>>>>> On Nov 2, 2021, at 9:52 PM, Ming Lei <ming.lei@redhat.com> wrote:
>>>>>>>>
>>>>>>>> On Tue, Nov 02, 2021 at 09:21:10PM -0600, Jens Axboe wrote:
>>>>>>>>>> On 11/2/21 8:21 PM, Yi Zhang wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Can either one of you try with this patch? Won't fix anything, but it'll
>>>>>>>>>>>> hopefully shine a bit of light on the issue.
>>>>>>>>>>>>
>>>>>>>>>> Hi Jens
>>>>>>>>>>
>>>>>>>>>> Here is the full log:
>>>>>>>>>
>>>>>>>>> Thanks! I think I see what it could be - can you try this one as well,
>>>>>>>>> would like to confirm that the condition I think is triggering is what
>>>>>>>>> is triggering.
>>>>>>>>>
>>>>>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>>>>>>> index 07eb1412760b..81dede885231 100644
>>>>>>>>> --- a/block/blk-mq.c
>>>>>>>>> +++ b/block/blk-mq.c
>>>>>>>>> @@ -2515,6 +2515,8 @@ void blk_mq_submit_bio(struct bio *bio)
>>>>>>>>>    if (plug && plug->cached_rq) {
>>>>>>>>>        rq = rq_list_pop(&plug->cached_rq);
>>>>>>>>>        INIT_LIST_HEAD(&rq->queuelist);
>>>>>>>>> +        WARN_ON_ONCE(q->elevator && !(rq->rq_flags & RQF_ELV));
>>>>>>>>> +        WARN_ON_ONCE(!q->elevator && (rq->rq_flags & RQF_ELV));
>>>>>>>>>    } else {
>>>>>>>>>        struct blk_mq_alloc_data data = {
>>>>>>>>>            .q        = q,
>>>>>>>>> @@ -2535,6 +2537,8 @@ void blk_mq_submit_bio(struct bio *bio)
>>>>>>>>>                bio_wouldblock_error(bio);
>>>>>>>>>            goto queue_exit;
>>>>>>>>>        }
>>>>>>>>> +        WARN_ON_ONCE(q->elevator && !(rq->rq_flags & RQF_ELV));
>>>>>>>>> +        WARN_ON_ONCE(!q->elevator && (rq->rq_flags & RQF_ELV));
>>>>>>>>
>>>>>>>> Hello Jens,
>>>>>>>>
>>>>>>>> I guess the issue could be the following code run without grabbing
>>>>>>>> ->q_usage_counter from blk_mq_alloc_request() and blk_mq_alloc_request_hctx().
>>>>>>>>
>>>>>>>> .rq_flags       = q->elevator ? RQF_ELV : 0,
>>>>>>>>
>>>>>>>> then elevator is switched to real one from none, and check on q->elevator
>>>>>>>> becomes not consistent.
>>>>>>>
>>>>>>> Indeed, that’s where I was going with this. I have a patch, testing it
>>>>>>> locally but it’s getting late. Will send it out tomorrow. The nice
>>>>>>> benefit is that it allows dropping the weird ref get on plug flush,
>>>>>>> and batches getting the refs as well.
>>>>>>
>>>>>> Yi/Steffen, can you try pulling this into your test kernel:
>>>>>>
>>>>>> git://git.kernel.dk/linux-block for-next
>>>>>>
>>>>>> and see if it fixes the issue for you. Thanks!
>>>>>
>>>>> It still can be reproduced with the latest linux-block/for-next, here is the log
>>>>>
>>>>> fab2914e46eb (HEAD, new/for-next) Merge branch 'for-5.16/drivers' into for-next
>>>>
>>>> Hi Yi,
>>>>
>>>> Please try the following change:
>>>>
>>>>
>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>> index e1e64964a31b..eb634a9c61ff 100644
>>>> --- a/block/blk-mq.c
>>>> +++ b/block/blk-mq.c
>>>> @@ -494,7 +494,6 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
>>>>  		.q		= q,
>>>>  		.flags		= flags,
>>>>  		.cmd_flags	= op,
>>>> -		.rq_flags	= q->elevator ? RQF_ELV : 0,
>>>>  		.nr_tags	= 1,
>>>>  	};
>>>>  	struct request *rq;
>>>> @@ -504,6 +503,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
>>>>  	if (ret)
>>>>  		return ERR_PTR(ret);
>>>>  
>>>> +	data.rq_flags	= q->elevator ? RQF_ELV : 0,
>>>>  	rq = __blk_mq_alloc_requests(&data);
>>>>  	if (!rq)
>>>>  		goto out_queue_exit;
>>>> @@ -524,7 +524,6 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
>>>>  		.q		= q,
>>>>  		.flags		= flags,
>>>>  		.cmd_flags	= op,
>>>> -		.rq_flags	= q->elevator ? RQF_ELV : 0,
>>>>  		.nr_tags	= 1,
>>>>  	};
>>>>  	u64 alloc_time_ns = 0;
>>>> @@ -551,6 +550,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
>>>>  	ret = blk_queue_enter(q, flags);
>>>>  	if (ret)
>>>>  		return ERR_PTR(ret);
>>>> +	data.rq_flags	= q->elevator ? RQF_ELV : 0,
>>>
>>> Don't think that will compile, but I guess the point is that we can't do
>>> this assignment before queue enter, in case we're in the midst of
>>> switching schedulers. Which is indeed a valid concern.
>>
>> Something like the below. Maybe? On top of the for-next that was already
>> pulled in.
>>
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index b01e05e02277..121f1898d529 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -433,9 +433,11 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
>>  	if (data->cmd_flags & REQ_NOWAIT)
>>  		data->flags |= BLK_MQ_REQ_NOWAIT;
>>  
>> -	if (data->rq_flags & RQF_ELV) {
>> +	if (q->elevator) {
>>  		struct elevator_queue *e = q->elevator;
>>  
>> +		data->rq_flags |= RQF_ELV;
>> +
>>  		/*
>>  		 * Flush/passthrough requests are special and go directly to the
>>  		 * dispatch list. Don't include reserved tags in the
>> @@ -494,7 +496,6 @@ struct request *blk_mq_alloc_request(struct request_queue *q, unsigned int op,
>>  		.q		= q,
>>  		.flags		= flags,
>>  		.cmd_flags	= op,
>> -		.rq_flags	= q->elevator ? RQF_ELV : 0,
>>  		.nr_tags	= 1,
>>  	};
>>  	struct request *rq;
>> @@ -524,7 +525,6 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
>>  		.q		= q,
>>  		.flags		= flags,
>>  		.cmd_flags	= op,
>> -		.rq_flags	= q->elevator ? RQF_ELV : 0,
>>  		.nr_tags	= 1,
>>  	};
>>  	u64 alloc_time_ns = 0;
>> @@ -565,6 +565,8 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
>>  
>>  	if (!q->elevator)
>>  		blk_mq_tag_busy(data.hctx);
>> +	else
>> +		data.rq_flags |= RQF_ELV;
>>  
>>  	ret = -EWOULDBLOCK;
>>  	tag = blk_mq_get_tag(&data);
>> @@ -2560,7 +2562,6 @@ void blk_mq_submit_bio(struct bio *bio)
>>  			.q		= q,
>>  			.nr_tags	= 1,
>>  			.cmd_flags	= bio->bi_opf,
>> -			.rq_flags	= q->elevator ? RQF_ELV : 0,
>>  		};
> 
> The above patch looks fine.
> 
> BTW, 9ede85cb670c ("block: move queue enter logic into
> blk_mq_submit_bio()") moves the queue enter into blk_mq_submit_bio(),
> which seems dangerous, especially blk_mq_sched_bio_merge() needs
> hctx/ctx which requires q_usage_counter to be grabbed.

I think the best solution is to enter just for that as well, and just
retain that enter state. I'll update the patch, there's some real fixes
in there too for the batched alloc. Will post them later today.

-- 
Jens Axboe


  reply	other threads:[~2021-11-03 15:41 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-02  6:42 [bug report] WARNING: CPU: 1 PID: 1386 at block/blk-mq-sched.c:432 blk_mq_sched_insert_request+0x54/0x178 Yi Zhang
2021-11-02 19:00 ` Steffen Maier
2021-11-02 19:02   ` Jens Axboe
2021-11-02 20:03     ` Jens Axboe
2021-11-03  2:21       ` Yi Zhang
2021-11-03  3:21         ` Jens Axboe
2021-11-03  3:28           ` Daejun Park
2021-11-03  3:51           ` Ming Lei
2021-11-03  3:54             ` Jens Axboe
2021-11-03  4:00               ` Yi Zhang
2021-11-03 19:03                 ` Jens Axboe
2021-11-05 11:13                   ` Yi Zhang
2021-11-03 11:59               ` Jens Axboe
2021-11-03 13:59                 ` Yi Zhang
2021-11-03 14:26                   ` Jens Axboe
2021-11-03 14:57                   ` Ming Lei
2021-11-03 15:03                     ` Jens Axboe
2021-11-03 15:09                       ` Ming Lei
2021-11-03 15:12                         ` Jens Axboe
2021-11-03 15:10                       ` Jens Axboe
2021-11-03 15:16                         ` Ming Lei
2021-11-03 15:41                           ` Jens Axboe [this message]
2021-11-03 15:49                             ` Jens Axboe
2021-11-03 16:09                               ` Ming Lei
2021-11-03 16:36                                 ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=30e8f85a-bc43-675f-6594-93c2c60ebd18@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-next@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=maier@linux.ibm.com \
    --cc=ming.lei@redhat.com \
    --cc=yi.zhang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox