public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <neeraj.iitr10@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
	bigeasy@linutronix.de
Subject: Re: [PATCH] rcu/nocb: Add an option to ON/OFF an offloading from RT context
Date: Fri, 6 May 2022 18:22:26 +0200	[thread overview]
Message-ID: <YnVLQozNFvgk3olP@pc638.lan> (raw)
In-Reply-To: <20220505190915.GW1790663@paulmck-ThinkPad-P17-Gen-1>

> On Thu, May 05, 2022 at 12:16:41PM +0200, Uladzislau Rezki (Sony) wrote:
> > Introduce a RCU_NOCB_CPU_CB_BOOST kernel option. So a user can
> > decide if an offloading has to be done in a high-prio context or
> > not. Please note an option depends on RCU_NOCB_CPU and RCU_BOOST
> > parameters and by default it is off.
> > 
> > This patch splits the boosting preempted RCU readers and those
> > kthreads which directly responsible for driving expedited grace
> > periods forward with enabling/disabling the offloading from/to
> > SCHED_FIFO/SCHED_OTHER contexts.
> > 
> > The main reason of such split is, for example on Android there
> > are some workloads which require fast expedited grace period to
> > be done whereas offloading in RT context can lead to starvation
> > and hogging a CPU for a long time what is not acceptable for
> > latency sensitive environment. For instance:
> > 
> > <snip>
> > <...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
> > <snip>
> > 
> > invoking 34 619 callbacks will take time thus making other CFS
> > tasks waiting in run-queue to be starved due to such behaviour.
> > 
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> 
> All good points!
> 
> Some questions and comments below.
> 
> Adding Sebastian on CC for his perspective.
> 
> 						Thanx, Paul
> 
> > ---
> >  kernel/rcu/Kconfig     | 14 ++++++++++++++
> >  kernel/rcu/tree.c      |  5 ++++-
> >  kernel/rcu/tree_nocb.h |  3 ++-
> >  3 files changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> > index 27aab870ae4c..074630b94902 100644
> > --- a/kernel/rcu/Kconfig
> > +++ b/kernel/rcu/Kconfig
> > @@ -275,6 +275,20 @@ config RCU_NOCB_CPU_DEFAULT_ALL
> >  	  Say Y here if you want offload all CPUs by default on boot.
> >  	  Say N here if you are unsure.
> >  
> > +config RCU_NOCB_CPU_CB_BOOST
> > +	bool "Perform offloading from real-time kthread"
> > +	depends on RCU_NOCB_CPU && RCU_BOOST
> > +	default n
> 
> I understand that you need this to default to "n" on your systems.
> However, other groups already using callback offloading should not see
> a sudden change.  I don't see an Android-specific defconfig file, but
> perhaps something in drivers/android/Kconfig?
> 
> One easy way to make this work would be to invert the sense of this
> Kconfig option ("RCU_NOCB_CB_NO_BOOST"?), continue having it default to
> "n", but then select it somewhere in drivers/android/Kconfig.  But I
> would not be surprised if there is a better way.
> 
It was done deliberately, i mean off by default. Because the user has to
think before enabling it for its workloads. It is not a big issue for
kthreads which drive a grace period forward, because their context runtime
i find pretty short. Whereas an offloading callback kthread can stuck
for a long time depending on workloads.

Also, i put it that way because initially those kthreads were staying
as SCHED_NORMAL even though the RCU_BOOST was set in kernel config.

<snip>
commit c8b16a65267e35ecc5621dbc81cbe7e5b0992fce
Author: Alison Chaiken <achaiken@aurora.tech>
Date:   Tue Jan 11 15:32:52 2022 -0800

    rcu: Elevate priority of offloaded callback threads
    
    When CONFIG_PREEMPT_RT=y, the rcutree.kthread_prio command-line
    parameter signals initialization code to boost the priority of rcuc
    callbacks to the designated value.  With the additional
    CONFIG_RCU_NOCB_CPU=y configuration and an additional rcu_nocbs
    command-line parameter, the callbacks on the listed cores are
    offloaded to new rcuop kthreads that are not pinned to the cores whose
    post-grace-period work is performed.  While the rcuop kthreads perform
    the same function as the rcuc kthreads they offload, the kthread_prio
    parameter only boosts the priority of the rcuc kthreads.  Fix this
    inconsistency by elevating rcuop kthreads to the same priority as the rcuc
    kthreads.
    
    Signed-off-by: Alison Chaiken <achaiken@aurora.tech>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
<snip>

I have a doubt that it is needed for CONFIG_PREEMPT_RT=y. The reason i mentioned
above it is a source of extra latency. That is why i have made it inactive by default.

Any thoughts?

> > +	help
> > +	  Use this option to offload callbacks from the SCHED_FIFO context
> > +	  to make the process faster. As a side effect of this approach is
> > +	  a latency especially for the SCHED_OTHER tasks which will not be
> > +	  able to preempt an offloading kthread. That latency depends on a
> > +	  number of callbacks to be invoked.
> > +
> > +	  Say Y here if you want to set RT priority for offloading kthreads.
> > +	  Say N here if you are unsure.
> > +
> >  config TASKS_TRACE_RCU_READ_MB
> >  	bool "Tasks Trace RCU readers use memory barriers in user and idle"
> >  	depends on RCU_EXPERT && TASKS_TRACE_RCU
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9dc4c4e82db6..d769a15bc0e3 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -154,7 +154,10 @@ static void sync_sched_exp_online_cleanup(int cpu);
> >  static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
> >  static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
> >  
> > -/* rcuc/rcub/rcuop kthread realtime priority */
> > +/*
> > + * rcuc/rcub/rcuop kthread realtime priority. The former
> > + * depends on if CONFIG_RCU_NOCB_CPU_CB_BOOST is set.
> 
> Aren't the rcuo[ps] kthreads controlled by the RCU_NOCB_CPU_CB_BOOST
> Kconfig option?  (As opposed to the "former", which is "rcuc".)
> 
The CONFIG_RCU_NOCB_CPU_CB_BOOST controls only the last what is
the rcuo CB kthread or "rcuo%c/%d" name. Sorry it is not "former"
it is the last in the rcuc/rcub/rcuop sequence. It was a typo :)

> > + */
> >  static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
> >  module_param(kthread_prio, int, 0444);
> >  
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index 60cc92cc6655..a2823be9b1d0 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -1315,8 +1315,9 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
> >  	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
> >  		goto end;
> >  
> > -	if (kthread_prio)
> > +	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST))
> 
> Don't we need both non-zero kthread_prio and the proper setting of the
> new Kconfig option before we run it at SCHED_FIFO?
> 
> Yes, we could rely on sched_setscheduler_nocheck() erroring out in
> that case, but that sounds like an accident waiting to happen.
> 
As far as i see it is odd, because the "kthread_prio" is verified so
there is a sanity check to check if the value is correct for SCHED_FIFO
case and does some adjustment if not. There is sanitize_kthread_prio()
that does all trick.

Looking at the kthread_prio variable. If it is set all the code that
takes into account of it switches to SCHED_FIFO class. Maybe rename it
to something kthread_rt_prio? It might be a bad idea though because of
former dependencies of distros and so on :)



--
Uladzislau Rezki

  reply	other threads:[~2022-05-06 16:22 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-05 10:16 [PATCH] rcu/nocb: Add an option to ON/OFF an offloading from RT context Uladzislau Rezki (Sony)
2022-05-05 19:09 ` Paul E. McKenney
2022-05-06 16:22   ` Uladzislau Rezki [this message]
2022-05-06 18:24     ` Paul E. McKenney
2022-05-07  9:11       ` Uladzislau Rezki
2022-05-07 22:32         ` Paul E. McKenney
2022-05-08  6:28           ` Joel Fernandes
2022-05-08 21:32             ` Paul E. McKenney
2022-05-09  0:17               ` Joel Fernandes
2022-05-09  3:37                 ` Paul E. McKenney
2022-05-09 17:17                   ` Joel Fernandes
2022-05-09 18:14                     ` Paul E. McKenney
2022-05-09 18:28                       ` Uladzislau Rezki
2022-05-09 18:39                         ` Paul E. McKenney
2022-05-09 18:43                           ` Uladzislau Rezki
2022-05-10 14:09                           ` Steven Rostedt
2022-05-10 14:24                             ` Paul E. McKenney
2022-05-10 14:35                             ` Uladzislau Rezki
2022-05-10 14:01                         ` Steven Rostedt
2022-05-11 13:39                           ` Uladzislau Rezki
2022-05-11 14:29                             ` Steven Rostedt
2022-05-11 14:51                               ` Uladzislau Rezki
2022-05-11 14:53                                 ` Uladzislau Rezki
2022-05-11 17:17                                   ` Uladzislau Rezki
2022-05-16 16:22                                     ` Steven Rostedt
2022-05-16 16:42                                       ` Uladzislau Rezki
2022-05-10 14:07   ` Sebastian Andrzej Siewior
2022-05-10 17:14     ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YnVLQozNFvgk3olP@pc638.lan \
    --to=urezki@gmail.com \
    --cc=bigeasy@linutronix.de \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neeraj.iitr10@gmail.com \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox