From: Jens Axboe <axboe@kernel.dk>
To: jason <zhangqing.luo@oracle.com>, Tejun Heo <tj@kernel.org>
Cc: Guru Anbalagane <guru.anbalagane@oracle.com>,
Feng Jin <joe.jin@oracle.com>,
linux-kernel@vger.kernel.org
Subject: Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs
Date: Thu, 22 Oct 2015 09:14:32 -0600
Message-ID: <5628FD58.4090909@kernel.dk>
In-Reply-To: <5628A91E.60208@oracle.com>
On 10/22/2015 03:15 AM, jason wrote:
>
>
> On Thursday, October 22, 2015 04:47 PM, Tejun Heo wrote:
>> Hello,
>>
>> On Mon, Oct 19, 2015 at 07:40:13AM -0700, Zhangqing Luo wrote:
>> ....
>> > So every time blk_mq_freeze_queue_start, it runs in this way
>> >
>> > blk_mq_freeze_queue_start
>> > ->percpu_ref_kill->percpu_ref_kill_and_confirm
>> > ->__percpu_ref_switch_to_atomic
>> > ->call_rcu_sched(&ref->rcu,percpu_ref_switch_to_atomic_rcu)
>> >
>> > and blk_mq_freeze_queue_wait blocks on queue->mq_usage_counter
>> > as it is not zero, and wake up by percpu_ref_switch_to_atomic_rcu
>> > after a grace period
>> >
>> >
>> > My question here is why should we change ref to PERCPU at
>> > blk_mq_finish_init?
>> > because of this changing, delay appears.
>>
>> Because percpu operation is way cheaper than atomic ones and we want
>> to optimize hot paths (request issue and completion) over cold paths
>> (init and config changes). That's the whole point of percpu
>> refcnting.
>>
>> The reason why percpu ref starts in atomic mode is to avoid expensive
>> percpu freezing if the queue is created and abandoned in quick
>> succession as SCSI does during LUN scanning. If percpu freezing is
>> happening during that, the right solution is moving finish_init to
>> late enough point so that percpu switching happens only after it's
>> known that the queue won't be abandoned.
>>
>> Thanks.
>>
> I agree with the optimizing hot paths by cheaper percpu operation,
> but how much does it affect the performance?
A lot, since the queue referencing happens twice per IO. The switch to
percpu was done to use shared/common code for this; the previous version
was a hand-rolled version of that.
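To make the trade-off concrete, here is a minimal userspace sketch of a two-mode reference counter. It is only a model: `struct model_ref`, the `model_ref_*` helpers, and `MODEL_NR_CPUS` are hypothetical names invented for illustration, not the kernel's percpu_ref API, and the `cpu` argument stands in for whichever CPU the caller runs on. In percpu mode a get or put touches only the caller's slot, which is what keeps the per-IO cost low. The true count is known only after every slot has been folded into the shared counter, and in the kernel that fold has to wait for a grace period (the call_rcu_sched step quoted above), which the model does not simulate.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for this model */

/* Hypothetical userspace model; NOT the kernel's struct percpu_ref. */
struct model_ref {
	bool percpu_mode;                 /* true: count in per-CPU slots */
	long percpu_count[MODEL_NR_CPUS]; /* each slot touched by one CPU only */
	atomic_long atomic_count;         /* shared counter for atomic mode */
};

/* Start in atomic mode with one reference held by the owner. */
static void model_ref_init(struct model_ref *ref)
{
	ref->percpu_mode = false;
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		ref->percpu_count[i] = 0;
	atomic_init(&ref->atomic_count, 1);
}

/* Take a reference on behalf of the given CPU. */
static void model_ref_get(struct model_ref *ref, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (ref->percpu_mode)
		ref->percpu_count[cpu]++; /* cheap: no shared cache line */
	else
		atomic_fetch_add(&ref->atomic_count, 1);
}

/* Drop a reference on behalf of the given CPU. */
static void model_ref_put(struct model_ref *ref, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (ref->percpu_mode)
		ref->percpu_count[cpu]--; /* a single slot may go negative */
	else
		atomic_fetch_sub(&ref->atomic_count, 1);
}

static void model_ref_switch_to_percpu(struct model_ref *ref)
{
	ref->percpu_mode = true;
}

/*
 * Fold every per-CPU slot into the shared counter. The kernel must first
 * wait for a grace period so that no CPU is still using its slot; this
 * model skips that wait and only does the summation.
 */
static void model_ref_switch_to_atomic(struct model_ref *ref)
{
	if (!ref->percpu_mode)
		return;
	ref->percpu_mode = false;
	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		atomic_fetch_add(&ref->atomic_count, ref->percpu_count[i]);
		ref->percpu_count[i] = 0;
	}
}

/* Only meaningful in atomic mode, after the slots have been folded in. */
static long model_ref_read(struct model_ref *ref)
{
	return atomic_load(&ref->atomic_count);
}
```

If an IO takes its reference on CPU 0 and drops it on CPU 1, slot 0 reads +1 and slot 1 reads -1. Neither is meaningful alone, so a freeze cannot test for zero until the ref is back in atomic mode.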
> As you know, the switching causes delay, and as the number of LUNs
> increases
> the delay becomes higher, so do you have any idea
> about the problem?
Tejun already outlined a good solution to the problem:
"If percpu freezing is
happening during that, the right solution is moving finish_init to
late enough point so that percpu switching happens only after it's
known that the queue won't be abandoned."
It'd be great if you could look into that. Your original patch
demonstrates exactly where the problem is, but it's not something that
can be applied, of course.
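As a rough illustration of why the timing of the switch matters, the sketch below counts modeled grace-period waits during a scan under two policies. It is a toy model built only from Tejun's description, not kernel code: `enum switch_policy`, `model_scan_grace_periods()`, and the `kept[]` array are hypothetical. It assumes that tearing down an abandoned queue costs one grace-period wait if the ref is already in percpu mode and nothing if it is still in atomic mode.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policies for when a queue's ref leaves atomic mode. */
enum switch_policy {
	SWITCH_AT_INIT,    /* switch to percpu before the queue's fate is known */
	SWITCH_AFTER_KEEP, /* switch only once the queue is known to be kept */
};

/*
 * Count modeled grace-period waits for a scan over nr probed queues.
 * kept[i] is true when queue i turns out to back a real device and
 * stays; false when it is created and then abandoned.
 */
static unsigned long model_scan_grace_periods(const bool *kept, size_t nr,
					      enum switch_policy policy)
{
	unsigned long waits = 0;

	for (size_t i = 0; i < nr; i++) {
		if (kept[i])
			continue; /* kept queues are not torn down in this model */

		/*
		 * Abandoned queue: freezing a ref that is already in percpu
		 * mode is modeled as one grace-period wait. With the late
		 * policy the ref is still atomic here, so no wait is counted.
		 */
		if (policy == SWITCH_AT_INIT)
			waits++;
	}
	return waits;
}
```

For a scan where four of five probed queues are abandoned, the early policy counts four waits and the late policy counts none. The model says nothing about how long each wait is; it only shows that the number of waits scales with the number of abandoned queues when the switch happens too early.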
--
Jens Axboe