From: "Raslan, KarimAllah" <karahmed@amazon.de>
To: "paulmck@linux.ibm.com" <paulmck@linux.ibm.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"mathieu.desnoyers@efficios.com" <mathieu.desnoyers@efficios.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"josh@joshtriplett.org" <josh@joshtriplett.org>,
"jiangshanlai@gmail.com" <jiangshanlai@gmail.com>
Subject: Re: [PATCH] rcu: Benefit from expedited grace period in __wait_rcu_gp
Date: Fri, 19 Oct 2018 19:45:51 +0000 [thread overview]
Message-ID: <1539978350.11839.6.camel@amazon.de> (raw)
In-Reply-To: <20181019123106.GX2674@linux.ibm.com>
On Fri, 2018-10-19 at 05:31 -0700, Paul E. McKenney wrote:
> On Fri, Oct 19, 2018 at 02:49:05AM +0200, KarimAllah Ahmed wrote:
> >
> > When expedited grace-period is set, both synchronize_sched
> > synchronize_rcu_bh can be optimized to have a significantly lower latency.
> >
> > Improve wait_rcu_gp handling to also account for expedited grace-period.
> > The downside is that wait_rcu_gp will not wait anymore for all RCU variants
> > concurrently when an expedited grace-period is set, however, given the
> > improved latency it does not really matter.
> >
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Josh Triplett <josh@joshtriplett.org>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
>
> Cute!
>
> Unfortunately, there are a few problems with this patch:
>
> 1. I will be eliminating synchronize_rcu_mult() due to the fact that
> the upcoming RCU flavor consolidation eliminates its sole caller.
> See 5fc9d4e000b1 ("rcu: Eliminate synchronize_rcu_mult()")
> in my -rcu tree. This would of course also eliminate the effects
> of this patch.
Your patch covers our use-case already, but I still think that the semantics
for wait_rcu_gp is not clear to me.
The problem for us was that sched_cpu_deactivate would call
synchronize_rcu_mult which does not check for "expedited" at all. So even
though we are already using rcu_expedited sysctl variable, synchronize_rcu_mult
was just ignoring it.
That being said, I indeed overlooked rcu_normal and that it takes precedence
over expedited and I did not notice at all the deadlock you mentioned below!
That can however be easily fixed by also checking for !rcu_gp_is_normal.
>
> 2. The real-time guys' users are not going to be at all happy
> with the IPIs resulting from the _expedited() API members.
> Yes, they can boot with rcupdate.rcu_normal=1, but they don't
> always need that big a hammer, and use of this kernel parameter
> can slow down boot, hibernation, suspend, network configuration,
> and much else besides. We therefore don't want them to have to
> use rcupdate.rcu_normal=1 unless absolutely necessary.
I might be missing something here. Why would they need to "explicitly" use
rcu_normal? If rcu_expedited is set, would not the expected behavior is to call
into the expedited version?
My patch should only activate *expedited* if and only if it is set.
I think I might be misunderstanding the expected behavior
from synchronize_rcu_mult. My understanding is that something like:
synchronize_rcu_mult(call_rcu_sched) and synchronize_rcu() should have an
identical behavior, right?
At least in this commit:
commit d7d34d5e46140 ("sched: Rely on synchronize_rcu_mult() de-duplication")
.. the change clearly gives the impression that they can be used
interchangeably. The problem is that this is not true when you look at the
implementation. One of them (i.e. synchronize_rcu) will respect the
expedite_rcu flag set by sysfs while the other (i.e. synchronize_rcu_mult)
simply ignores it.
So my patch is about making sure that both of the variants actually respect
it.
> 3. If the real-time guys' users were to have booted with
> rcupdate.rcu_normal=1, then synchronize_sched_expedited()
> would invoke _synchronize_rcu_expedited, which would invoke
> wait_rcu_gp(), which would invoke _wait_rcu_gp() which would
> invoke __wait_rcu_gp(), which, given your patch, would in turn
> invoke synchronize_sched_expedited(). This situation could
> well prevent their systems from meeting their response-time
> requirements.
>
> So I cannot accept this patch nor for that matter any similar patch.
>
> But what were you really trying to get done here? If you were thinking
> of adding another synchronize_rcu_mult(), the flavor consolidation will
> make that unnecessary in most cases. If you are trying to speed up
> CPU-hotplug operations, I suggest using the rcu_expedited sysctl variable
> when taking a CPU offline. If something else, please let me know what
> it is so that we can work out how the problem might best be solved.
>
> Thanx, Paul
>
> >
> > ---
> > kernel/rcu/update.c | 34 ++++++++++++++++++++++++++++------
> > 1 file changed, 28 insertions(+), 6 deletions(-)
> >
> > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > index 68fa19a..44b8817 100644
> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -392,13 +392,27 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
> > might_sleep();
> > continue;
> > }
> > - init_rcu_head_on_stack(&rs_array[i].head);
> > - init_completion(&rs_array[i].completion);
> > +
> > for (j = 0; j < i; j++)
> > if (crcu_array[j] == crcu_array[i])
> > break;
> > - if (j == i)
> > - (crcu_array[i])(&rs_array[i].head, wakeme_after_rcu);
> > + if (j != i)
> > + continue;
> > +
> > + if ((crcu_array[i] == call_rcu_sched ||
> > + crcu_array[i] == call_rcu_bh)
> > + && rcu_gp_is_expedited()) {
> > + if (crcu_array[i] == call_rcu_sched)
> > + synchronize_sched_expedited();
> > + else
> > + synchronize_rcu_bh_expedited();
> > +
> > + continue;
> > + }
> > +
> > + init_rcu_head_on_stack(&rs_array[i].head);
> > + init_completion(&rs_array[i].completion);
> > + (crcu_array[i])(&rs_array[i].head, wakeme_after_rcu);
> > }
> >
> > /* Wait for all callbacks to be invoked. */
> > @@ -407,11 +421,19 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
> > (crcu_array[i] == call_rcu ||
> > crcu_array[i] == call_rcu_bh))
> > continue;
> > +
> > + if ((crcu_array[i] == call_rcu_sched ||
> > + crcu_array[i] == call_rcu_bh)
> > + && rcu_gp_is_expedited())
> > + continue;
> > +
> > for (j = 0; j < i; j++)
> > if (crcu_array[j] == crcu_array[i])
> > break;
> > - if (j == i)
> > - wait_for_completion(&rs_array[i].completion);
> > + if (j != i)
> > + continue;
> > +
> > + wait_for_completion(&rs_array[i].completion);
> > destroy_rcu_head_on_stack(&rs_array[i].head);
> > }
> > }
> > --
> > 2.7.4
> >
>
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
next prev parent reply other threads:[~2018-10-19 19:46 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-19 0:49 [PATCH] rcu: Benefit from expedited grace period in __wait_rcu_gp KarimAllah Ahmed
2018-10-19 12:31 ` Paul E. McKenney
2018-10-19 19:45 ` Raslan, KarimAllah [this message]
2018-10-19 20:21 ` Paul E. McKenney
2018-10-23 15:13 ` Raslan, KarimAllah
2018-10-24 11:20 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1539978350.11839.6.camel@amazon.de \
--to=karahmed@amazon.de \
--cc=jiangshanlai@gmail.com \
--cc=josh@joshtriplett.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=paulmck@linux.ibm.com \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox