From: ming.lei@redhat.com (Ming Lei)
Subject: [PATCH] percpu-refcount: relax limit on percpu_ref_reinit()
Date: Thu, 13 Sep 2018 06:11:40 +0800 [thread overview]
Message-ID: <20180912221139.GB15810@ming.t460p> (raw)
In-Reply-To: <20180912155321.GE2966370@devbig004.ftw2.facebook.com>
On Wed, Sep 12, 2018 at 08:53:21AM -0700, Tejun Heo wrote:
> Hello,
>
> On Wed, Sep 12, 2018 at 09:52:48AM +0800, Ming Lei wrote:
> > > If you killed and waited until kill finished, you should be able to
> > > re-init. Is it that you want to kill but abort killing in some cases?
> >
> > Yes, it can be re-init, just with the warning of WARN_ON_ONCE(!percpu_ref_is_zero(ref)).
>
> We can add another interface but it can't be re _init_.
OK.
>
> > > How do you then handle the race against release? Can you please
> >
> > The .release is only called at atomic mode, and once we switch to
> > percpu mode, .release can't be called at all. Or I may not follow you,
> > could you explain a bit the race with release?
>
> Yeah but what guards ->release() starting to run and then the ref
> being switched to percpu mode? Or maybe that doesn't matter?
OK, we may add synchronize_rcu() just after clearing the DEAD flag in
the newly introduced helper to avoid the race.
>
> > > describe the exact usage you have on mind?
> >
> > Let me explain the use case:
> >
> > 1) nvme timeout comes
> >
> > 2) all pending requests are canceled, but won't be completed because
> > they have to be retried after the controller is recovered
> >
> > 3) meantime, the queue has to be frozen for avoiding new request, so
> > the refcount is killed via percpu_ref_kill().
> >
> > 4) after the queue is recovered(or the controller is reset successfully), it
> > isn't necessary to wait until the refcount drops zero, since it is fine to
> > reinit it by clearing DEAD and switching back to percpu mode from atomic mode.
> > And waiting for the refcount dropping to zero in the reset handler may trigger
> > IO hang if IO timeout happens again during reset.
>
> Does the recovery need the in-flight commands actually drained or does
> it just need to block new issues for a while. If latter, why is
The recovery doesn't actually need to drain the in-flight commands.
> percpu_ref even being used?
Just to avoid reinventing the wheel, especially since .q_usage_counter
has served this purpose for a long time.
>
> > So what I am trying to propose is the following usage:
> >
> > 1) percpu_ref_kill() on .q_usage_counter before recovering the controller for
> > preventing new requests from entering queue
>
> The way you're describing it, the above part is no different from
> having a global bool which gates new issues.
Right, but the global bool would have to be checked in the fast path, and
the synchronization between updating the flag and checking it has to be
considered. Given that blk-mq already uses .q_usage_counter for this
purpose, I suggest extending percpu-refcount to cover this use case.
>
> > 2) controller is recovered
> >
> > 3) percpu_ref_reinit() on .q_usage_counter, and do not wait for
> > .q_usage_counter dropping to zero, then we needn't to wait in NVMe reset
> > handler which can be thought as single thread, and avoid IO hang when
> > new timeout is triggered during the waiting.
>
> This sounds possibly confused to me. Can you please explain how the
> recovery may hang if you wait for the ref to drain?
The reset handler can be thought of as a single dedicated thread; if it
hangs while draining in-flight commands, it won't run again to deal with
the next timeout event.
thanks,
Ming