From: Jens Axboe <axboe@kernel.dk>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: jason <zhangqing.luo@oracle.com>, Tejun Heo <tj@kernel.org>,
	Guru Anbalagane <guru.anbalagane@oracle.com>,
	Feng Jin <joe.jin@oracle.com>,
	linux-kernel@vger.kernel.org
Subject: Re: blk-mq: takes hours for scsi scanning finish when thousands of LUNs
Date: Thu, 22 Oct 2015 10:06:09 -0600
Message-ID: <56290971.9060403@kernel.dk>
In-Reply-To: <x49a8rahpcw.fsf@segfault.boston.devel.redhat.com>

On 10/22/2015 09:53 AM, Jeff Moyer wrote:
> Jens Axboe <axboe@kernel.dk> writes:
>
>>> I agree with optimizing hot paths by using cheaper percpu operations,
>>> but how much does it affect the performance?
>>
>> A lot, since the queue referencing happens twice per IO. The switch to
>> percpu was done to use shared/common code for this, the previous
>> version was a handrolled version of that.
>>
>>> As you know, the switching causes a delay, and the delay grows as
>>> the number of LUNs increases. Do you have any idea about the
>>> problem?
>>
>> Tejun already outlined a good solution to the problem:
>>
>> "If percpu freezing is
>> happening during that, the right solution is moving finish_init to
>> late enough point so that percpu switching happens only after it's
>> known that the queue won't be abandoned."
>
> I'm sure I'm missing something, but I don't think that will work.
> blk_mq_update_tag_depth is freezing every single queue.  Those queues
> are already setup and will not go away.  So how will moving finish_init
> later in the queue setup fix this?  The patch Jason provided most likely
> works because __percpu_ref_switch_to_atomic doesn't do anything.  The
> most important things it doesn't do are:
> 1) percpu_ref_get(mq_usage_counter), followed by
> 2) call_rcu_sched()
>
> It seems likely to me that forcing an rcu grace period for every single
> LUN attached to a particular host is what's causing the delay.
>
> And now you'll tell me how I've got that all wrong.  ;-)

Haha, no, I think that is absolutely right. We've seen these bugs a lot:
thousands of serialized RCU grace period waits. This is just one more.
The patch that Jason sent just bypassed the percpu switch, which we
can't do.

> Anyway, I think what Jason had initially suggested would work:
>
>    "if this thing must be done, as the code below shows just changing
>     flags depending on 'shared' variable why shouldn't we store the
>     previous result of 'shared' and compare with the current result, if
>     it's unchanged, nothing will be done and avoid looping all queues in
>     list."
>
> I think that percolating BLK_MQ_F_TAG_SHARED up to the tag set would
> allow newly created hctxs to simply inherit the shared state (in
> blk_mq_init_hctx), and you won't need to freeze every queue in order to
> guarantee that.
>
> I was writing a patch to that effect.  I've now stopped as I want to
> make sure I'm not off in the weeds.  :)

If that is where the delay comes from, then yes, that should fix it and
be a trivial patch.
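
For illustration only (this is not the patch discussed in this thread,
and not kernel code): a minimal user-space sketch in C of the idea quoted
above. It models a tag set that caches the shared state so that a new
queue inherits it, and compares how many freeze cycles each approach
performs. Every name here (model_queue, model_tag_set, TAG_SHARED,
add_queue_naive, add_queue_cached, model_count_freezes) is hypothetical,
and a "freeze" is just a counter increment standing in for the freeze and
unfreeze of one queue, each of which is assumed to cost one grace period.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TAG_SHARED 0x1u   /* hypothetical stand-in for BLK_MQ_F_TAG_SHARED */
#define MAX_QUEUES 4096   /* arbitrary bound for this model */

struct model_queue {
	unsigned int flags;
};

struct model_tag_set {
	struct model_queue queues[MAX_QUEUES];
	int nr_queues;
	unsigned int flags;     /* shared state cached at the set level */
	unsigned long freezes;  /* simulated freeze/unfreeze cycles */
};

/* Naive model: every added queue walks and "freezes" all queues in the set. */
static void add_queue_naive(struct model_tag_set *set)
{
	bool shared;
	int i;

	set->nr_queues++;
	shared = set->nr_queues > 1;

	for (i = 0; i < set->nr_queues; i++) {
		set->freezes++;
		if (shared)
			set->queues[i].flags |= TAG_SHARED;
		else
			set->queues[i].flags &= ~TAG_SHARED;
	}
}

/*
 * Cached model: walk the existing queues only when the shared state of
 * the set actually changes; the new queue inherits the state from the set.
 */
static void add_queue_cached(struct model_tag_set *set)
{
	struct model_queue *q = &set->queues[set->nr_queues++];
	bool shared = set->nr_queues > 1;
	bool was_shared = (set->flags & TAG_SHARED) != 0;
	int i;

	if (shared != was_shared) {
		if (shared)
			set->flags |= TAG_SHARED;
		else
			set->flags &= ~TAG_SHARED;

		/* Only the queues that existed before this one need updating. */
		for (i = 0; i < set->nr_queues - 1; i++) {
			set->freezes++;
			set->queues[i].flags =
				(set->queues[i].flags & ~TAG_SHARED) |
				(set->flags & TAG_SHARED);
		}
	}

	/* Inherit the shared state without freezing anything. */
	q->flags = (q->flags & ~TAG_SHARED) | (set->flags & TAG_SHARED);
}

/* Add n queues to an empty set and report how many freezes that took. */
static unsigned long model_count_freezes(int n, bool cached)
{
	static struct model_tag_set set;
	int i;

	assert(n >= 0 && n <= MAX_QUEUES);
	memset(&set, 0, sizeof(set));

	for (i = 0; i < n; i++) {
		if (cached)
			add_queue_cached(&set);
		else
			add_queue_naive(&set);
	}
	return set.freezes;
}
```

In this model, model_count_freezes(1000, false) performs 500500 freezes
(the count grows quadratically with the number of queues), while
model_count_freezes(1000, true) performs exactly one, at the point where
the set goes from one queue to two.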

-- 
Jens Axboe

