Re: [PATCH] rcu/nocb: Add an option to ON/OFF an offloading from RT context

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Uladzislau Rezki <urezki@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.iitr10@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
	bigeasy@linutronix.de
Subject: Re: [PATCH] rcu/nocb: Add an option to ON/OFF an offloading from RT context
Date: Fri, 6 May 2022 18:22:26 +0200	[thread overview]
Message-ID: <YnVLQozNFvgk3olP@pc638.lan> (raw)
In-Reply-To: <20220505190915.GW1790663@paulmck-ThinkPad-P17-Gen-1>

> On Thu, May 05, 2022 at 12:16:41PM +0200, Uladzislau Rezki (Sony) wrote:
> > Introduce a RCU_NOCB_CPU_CB_BOOST kernel option. So a user can
> > decide if an offloading has to be done in a high-prio context or
> > not. Please note an option depends on RCU_NOCB_CPU and RCU_BOOST
> > parameters and by default it is off.
> > 
> > This patch splits the boosting preempted RCU readers and those
> > kthreads which directly responsible for driving expedited grace
> > periods forward with enabling/disabling the offloading from/to
> > SCHED_FIFO/SCHED_OTHER contexts.
> > 
> > The main reason of such split is, for example on Android there
> > are some workloads which require fast expedited grace period to
> > be done whereas offloading in RT context can lead to starvation
> > and hogging a CPU for a long time what is not acceptable for
> > latency sensitive environment. For instance:
> > 
> > <snip>
> > <...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
> > <snip>
> > 
> > invoking 34 619 callbacks will take time thus making other CFS
> > tasks waiting in run-queue to be starved due to such behaviour.
> > 
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> 
> All good points!
> 
> Some questions and comments below.
> 
> Adding Sebastian on CC for his perspective.
> 
> 						Thanx, Paul
> 
> > ---
> >  kernel/rcu/Kconfig     | 14 ++++++++++++++
> >  kernel/rcu/tree.c      |  5 ++++-
> >  kernel/rcu/tree_nocb.h |  3 ++-
> >  3 files changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> > index 27aab870ae4c..074630b94902 100644
> > --- a/kernel/rcu/Kconfig
> > +++ b/kernel/rcu/Kconfig
> > @@ -275,6 +275,20 @@ config RCU_NOCB_CPU_DEFAULT_ALL
> >  	  Say Y here if you want offload all CPUs by default on boot.
> >  	  Say N here if you are unsure.
> >  
> > +config RCU_NOCB_CPU_CB_BOOST
> > +	bool "Perform offloading from real-time kthread"
> > +	depends on RCU_NOCB_CPU && RCU_BOOST
> > +	default n
> 
> I understand that you need this to default to "n" on your systems.
> However, other groups already using callback offloading should not see
> a sudden change.  I don't see an Android-specific defconfig file, but
> perhaps something in drivers/android/Kconfig?
> 
> One easy way to make this work would be to invert the sense of this
> Kconfig option ("RCU_NOCB_CB_NO_BOOST"?), continue having it default to
> "n", but then select it somewhere in drivers/android/Kconfig.  But I
> would not be surprised if there is a better way.
> 
It was done deliberately, i mean off by default. Because the user has to
think before enabling it for its workloads. It is not a big issue for
kthreads which drive a grace period forward, because their context runtime
i find pretty short. Whereas an offloading callback kthread can stuck
for a long time depending on workloads.

Also, i put it that way because initially those kthreads were staying
as SCHED_NORMAL even though the RCU_BOOST was set in kernel config.

<snip>
commit c8b16a65267e35ecc5621dbc81cbe7e5b0992fce
Author: Alison Chaiken <achaiken@aurora.tech>
Date:   Tue Jan 11 15:32:52 2022 -0800

    rcu: Elevate priority of offloaded callback threads
    
    When CONFIG_PREEMPT_RT=y, the rcutree.kthread_prio command-line
    parameter signals initialization code to boost the priority of rcuc
    callbacks to the designated value.  With the additional
    CONFIG_RCU_NOCB_CPU=y configuration and an additional rcu_nocbs
    command-line parameter, the callbacks on the listed cores are
    offloaded to new rcuop kthreads that are not pinned to the cores whose
    post-grace-period work is performed.  While the rcuop kthreads perform
    the same function as the rcuc kthreads they offload, the kthread_prio
    parameter only boosts the priority of the rcuc kthreads.  Fix this
    inconsistency by elevating rcuop kthreads to the same priority as the rcuc
    kthreads.
    
    Signed-off-by: Alison Chaiken <achaiken@aurora.tech>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
<snip>

I have a doubt that it is needed for CONFIG_PREEMPT_RT=y. The reason i mentioned
above it is a source of extra latency. That is why i have made it inactive by default.

Any thoughts?

> > +	help
> > +	  Use this option to offload callbacks from the SCHED_FIFO context
> > +	  to make the process faster. As a side effect of this approach is
> > +	  a latency especially for the SCHED_OTHER tasks which will not be
> > +	  able to preempt an offloading kthread. That latency depends on a
> > +	  number of callbacks to be invoked.
> > +
> > +	  Say Y here if you want to set RT priority for offloading kthreads.
> > +	  Say N here if you are unsure.
> > +
> >  config TASKS_TRACE_RCU_READ_MB
> >  	bool "Tasks Trace RCU readers use memory barriers in user and idle"
> >  	depends on RCU_EXPERT && TASKS_TRACE_RCU
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9dc4c4e82db6..d769a15bc0e3 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -154,7 +154,10 @@ static void sync_sched_exp_online_cleanup(int cpu);
> >  static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
> >  static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
> >  
> > -/* rcuc/rcub/rcuop kthread realtime priority */
> > +/*
> > + * rcuc/rcub/rcuop kthread realtime priority. The former
> > + * depends on if CONFIG_RCU_NOCB_CPU_CB_BOOST is set.
> 
> Aren't the rcuo[ps] kthreads controlled by the RCU_NOCB_CPU_CB_BOOST
> Kconfig option?  (As opposed to the "former", which is "rcuc".)
> 
The CONFIG_RCU_NOCB_CPU_CB_BOOST controls only the last what is
the rcuo CB kthread or "rcuo%c/%d" name. Sorry it is not "former"
it is the last in the rcuc/rcub/rcuop sequence. It was a typo :)

> > + */
> >  static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
> >  module_param(kthread_prio, int, 0444);
> >  
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index 60cc92cc6655..a2823be9b1d0 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -1315,8 +1315,9 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
> >  	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
> >  		goto end;
> >  
> > -	if (kthread_prio)
> > +	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST))
> 
> Don't we need both non-zero kthread_prio and the proper setting of the
> new Kconfig option before we run it at SCHED_FIFO?
> 
> Yes, we could rely on sched_setscheduler_nocheck() erroring out in
> that case, but that sounds like an accident waiting to happen.
> 
As far as i see it is odd, because the "kthread_prio" is verified so
there is a sanity check to check if the value is correct for SCHED_FIFO
case and does some adjustment if not. There is sanitize_kthread_prio()
that does all trick.

Looking at the kthread_prio variable. If it is set all the code that
takes into account of it switches to SCHED_FIFO class. Maybe rename it
to something kthread_rt_prio? It might be a bad idea though because of
former dependencies of distros and so on :)



--
Uladzislau Rezki

next prev parent reply	other threads:[~2022-05-06 16:22 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-05 10:16 [PATCH] rcu/nocb: Add an option to ON/OFF an offloading from RT context Uladzislau Rezki (Sony)
2022-05-05 19:09 ` Paul E. McKenney
2022-05-06 16:22   ` Uladzislau Rezki [this message]
2022-05-06 18:24     ` Paul E. McKenney
2022-05-07  9:11       ` Uladzislau Rezki
2022-05-07 22:32         ` Paul E. McKenney
2022-05-08  6:28           ` Joel Fernandes
2022-05-08 21:32             ` Paul E. McKenney
2022-05-09  0:17               ` Joel Fernandes
2022-05-09  3:37                 ` Paul E. McKenney
2022-05-09 17:17                   ` Joel Fernandes
2022-05-09 18:14                     ` Paul E. McKenney
2022-05-09 18:28                       ` Uladzislau Rezki
2022-05-09 18:39                         ` Paul E. McKenney
2022-05-09 18:43                           ` Uladzislau Rezki
2022-05-10 14:09                           ` Steven Rostedt
2022-05-10 14:24                             ` Paul E. McKenney
2022-05-10 14:35                             ` Uladzislau Rezki
2022-05-10 14:01                         ` Steven Rostedt
2022-05-11 13:39                           ` Uladzislau Rezki
2022-05-11 14:29                             ` Steven Rostedt
2022-05-11 14:51                               ` Uladzislau Rezki
2022-05-11 14:53                                 ` Uladzislau Rezki
2022-05-11 17:17                                   ` Uladzislau Rezki
2022-05-16 16:22                                     ` Steven Rostedt
2022-05-16 16:42                                       ` Uladzislau Rezki
2022-05-10 14:07   ` Sebastian Andrzej Siewior
2022-05-10 17:14     ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YnVLQozNFvgk3olP@pc638.lan \
    --to=urezki@gmail.com \
    --cc=bigeasy@linutronix.de \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraj.iitr10@gmail.com \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.