From: Jens Axboe <axboe@kernel.dk>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: jason <zhangqing.luo@oracle.com>, Tejun Heo <tj@kernel.org>,
Guru Anbalagane <guru.anbalagane@oracle.com>,
Feng Jin <joe.jin@oracle.com>,
linux-kernel@vger.kernel.org
Subject: Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs
Date: Thu, 22 Oct 2015 10:06:09 -0600
Message-ID: <56290971.9060403@kernel.dk>
In-Reply-To: <x49a8rahpcw.fsf@segfault.boston.devel.redhat.com>

On 10/22/2015 09:53 AM, Jeff Moyer wrote:
> Jens Axboe <axboe@kernel.dk> writes:
>
>>> I agree with the optimizing hot paths by cheaper percpu operation,
>>> but how much does it affect the performance?
>>
>> A lot, since the queue referencing happens twice per IO. The switch to
>> percpu was done to use shared/common code for this, the previous
>> version was a handrolled version of that.
>>
>>> as you know the switching causes delay, when the LUN number is
>>> increasing
>>> the delay is becoming higher, so do you have any idea
>>> about the problem?
>>
>> Tejun already outlined a good solution to the problem:
>>
>> "If percpu freezing is
>> happening during that, the right solution is moving finish_init to
>> late enough point so that percpu switching happens only after it's
>> known that the queue won't be abandoned."
>
> I'm sure I'm missing something, but I don't think that will work.
> blk_mq_update_tag_depth is freezing every single queue. Those queues
> are already setup and will not go away. So how will moving finish_init
> later in the queue setup fix this? The patch Jason provided most likely
> works because __percpu_ref_switch_to_atomic doesn't do anything. The
> most important things it doesn't do are:
> 1) percpu_ref_get(mq_usage_counter), followed by
> 2) call_rcu_sched()
>
> It seems likely to me that forcing an rcu grace period for every single
> LUN attached to a particular host is what's causing the delay.
>
> And now you'll tell me how I've got that all wrong. ;-)

Haha, no, I think that is absolutely right. We've seen this class of bug
a lot: thousands of serialized RCU grace period waits. This is just one
more. The patch that Jason sent just bypassed the percpu switch, which
we can't do.

> Anyway, I think what Jason had initially suggested, would work:
>
> "if this thing must be done, as the code below shows just changing
> flags depending on 'shared' variable why shouldn't we store the
> previous result of 'shared' and compare with the current result, if
> it's unchanged, nothing will be done and avoid looping all queues in
> list."
>
> I think that percolating BLK_MQ_F_TAG_SHARED up to the tag set would
> allow newly created hctxs to simply inherit the shared state (in
> blk_mq_init_hctx), and you won't need to freeze every queue in order to
> guarantee that.
>
> I was writing a patch to that effect. I've now stopped as I want to
> make sure I'm not off in the weeds. :)

If that is where the delay comes from, then yes, that should fix it and be
a trivial patch.
--
Jens Axboe