From: Frederic Weisbecker <frederic@kernel.org>
To: "Zhang, Qiang1" <qiang1.zhang@intel.com>
Cc: "paulmck@kernel.org" <paulmck@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Neeraj Upadhyay <quic_neeraju@quicinc.com>,
	Uladzislau Rezki <uladzislau.rezki@sony.com>,
	Boqun Feng <boqun.feng@gmail.com>
Subject: Re: [PATCH] rcu/nocb: Clear rdp offloaded flags when rcuop/rcuog kthreads spawn failed
Date: Wed, 9 Mar 2022 22:06:57 +0100
Message-ID: <20220309210657.GA68899@lothringen>
In-Reply-To: <PH0PR11MB5880F450C2DDD04D4C76F814DA099@PH0PR11MB5880.namprd11.prod.outlook.com>

On Tue, Mar 08, 2022 at 07:37:24AM +0000, Zhang, Qiang1 wrote:
> 
> On Mon, Feb 28, 2022 at 05:36:29PM +0800, Zqiang wrote:
> > When CONFIG_RCU_NOCB_CPU is enabled and 'rcu_nocbs' is set, the rcuop
> > and rcuog kthreads are created. However, creating the rcuop or rcuog
> > kthreads may fail; if it does, clear the rdp offloaded flags.
> > 
> > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > ---
> >  kernel/rcu/tree_nocb.h | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index 46694e13398a..94b279147954 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -1246,7 +1246,7 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
> >  				"rcuog/%d", rdp_gp->cpu);
> >  		if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) {
> >  			mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex);
> > -			return;
> > +			goto end;
> >  		}
> >  		WRITE_ONCE(rdp_gp->nocb_gp_kthread, t);
> >  		if (kthread_prio)
> > @@ -1258,12 +1258,22 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
> >  	t = kthread_run(rcu_nocb_cb_kthread, rdp,
> >  			"rcuo%c/%d", rcu_state.abbr, cpu);
> >  	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
> > -		return;
> > +		goto end;
> >  
> >  	if (kthread_prio)
> >  		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> >  	WRITE_ONCE(rdp->nocb_cb_kthread, t);
> >  	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
> > +	return;
> > +end:
> > +	if (cpumask_test_cpu(cpu, rcu_nocb_mask)) {
> > +		rcu_segcblist_offload(&rdp->cblist, false);
> > +		rcu_segcblist_clear_flags(&rdp->cblist,
> > +				SEGCBLIST_KTHREAD_CB | SEGCBLIST_KTHREAD_GP);
> > +		rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_LOCKING);
> > +		rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_RCU_CORE);
> > +	}
> >>
> >>Thank you. The consequences are indeed bad otherwise, because the target is considered offloaded but nothing actually handles the callbacks.
> >>
> >>A few issues though:
> >>
> >>* The rdp_gp kthread may be running concurrently. If it's iterating this rdp and
> >>  the SEGCBLIST_LOCKING flag is cleared in the middle, rcu_nocb_unlock() won't
> >>  release (among many other possible issues).
> >>
> >>* we should clear the cpu from rcu_nocb_mask or we won't be able to later
> >>  re-offload it.
> >>
> >>* we should then delete the rdp from the group list:
> >>
> >>     list_del_rcu(&rdp->nocb_entry_rdp);
> >>
> >>So ideally we should call rcu_nocb_rdp_deoffload(). But then bear in mind:
> >>
> >>1) We must lock rcu_state.barrier_mutex and the hotplug read lock. But since
> >>   we are called from rcutree_prepare_cpu(), we may be holding the hotplug
> >>   write lock already.
> >>
> >>   Therefore we first need to invert the locking dependency order between
> >>   rcu_state.barrier_mutex and hotplug lock and then just lock the barrier_mutex
> >>   before calling rcu_nocb_rdp_deoffload() from our failure path.
> >>   
> >>
> >>2) On rcu_nocb_rdp_deoffload(), handle non-existing nocb_gp and/or nocb_cb
> >>   kthreads. Make sure we are holding nocb_gp_kthread_mutex.
> 
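
To illustrate the first issue above (clearing SEGCBLIST_LOCKING while
another thread is in the middle of a locked section), here is a minimal
userspace sketch. It is only a model, not kernel code: struct model_cblist,
MODEL_LOCKING, model_nocb_lock() and model_nocb_unlock() are hypothetical
names, a pthread mutex stands in for the real lock, and the concurrent flag
clear is simulated sequentially between the lock and unlock calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "this list needs locking" flag. */
#define MODEL_LOCKING 0x1u

struct model_cblist {
	unsigned int flags;
	pthread_mutex_t lock;
};

/* Take the lock only while the list is flagged as needing it. */
static void model_nocb_lock(struct model_cblist *cl)
{
	if (cl->flags & MODEL_LOCKING)
		pthread_mutex_lock(&cl->lock);
}

/* Release the lock only if the flag is (still) set. */
static void model_nocb_unlock(struct model_cblist *cl)
{
	if (cl->flags & MODEL_LOCKING)
		pthread_mutex_unlock(&cl->lock);
}

/*
 * Run one lock/unlock pair, optionally clearing the flag in between
 * (as a failure path running at the wrong moment would).
 * Returns true if the lock was left held afterwards.
 */
bool model_lock_left_held(bool clear_flag_midway)
{
	struct model_cblist cl = { .flags = MODEL_LOCKING };
	bool held;

	pthread_mutex_init(&cl.lock, NULL);

	model_nocb_lock(&cl);
	if (clear_flag_midway)
		cl.flags &= ~MODEL_LOCKING;
	model_nocb_unlock(&cl);	/* becomes a no-op if the flag was cleared */

	/* trylock fails (non-zero) if the mutex is still locked. */
	held = pthread_mutex_trylock(&cl.lock) != 0;

	/* Either way the mutex is locked here; release and clean up. */
	pthread_mutex_unlock(&cl.lock);
	pthread_mutex_destroy(&cl.lock);
	return held;
}
```

In this model, model_lock_left_held(true) reports a leaked lock while
model_lock_left_held(false) does not, which is the "rcu_nocb_unlock() won't
release" scenario in miniature.
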
> Sorry for the late reply. Is the nocb_gp_kthread_mutex really necessary?
> CPU online/offline is a serial operation; it is protected by cpus_write_lock().

And you're right! But some people are working on making cpu_up() able to work
in parallel for faster bring-up on boot.
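
To make the concern concrete, here is a rough userspace sketch of what the
mutex buys once bring-up is no longer serialized. It is a model under
stated assumptions, not the kernel implementation: struct model_group,
model_cpu_bringup() and model_parallel_bringup() are hypothetical names,
pthreads stand in for CPUs coming up concurrently, and a counter stands in
for creating the group's rcuog kthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_MAX_CPUS 16

struct model_group {
	pthread_mutex_t spawn_mutex;	/* plays the role of nocb_gp_kthread_mutex */
	int gp_thread_created;		/* models "nocb_gp_kthread is set" */
	int spawn_count;		/* how many times the GP thread was "spawned" */
};

/* One CPU of the group coming up: spawn the group GP thread if missing. */
static void *model_cpu_bringup(void *arg)
{
	struct model_group *g = arg;

	pthread_mutex_lock(&g->spawn_mutex);
	if (!g->gp_thread_created) {
		g->spawn_count++;	/* stands in for the kthread creation */
		g->gp_thread_created = 1;
	}
	pthread_mutex_unlock(&g->spawn_mutex);
	return NULL;
}

/*
 * Bring up ncpus "CPUs" of one group in parallel and return how many
 * times the shared GP thread ended up being created.
 */
int model_parallel_bringup(int ncpus)
{
	struct model_group g = { .gp_thread_created = 0, .spawn_count = 0 };
	pthread_t tids[MODEL_MAX_CPUS];
	int i, started = 0;

	if (ncpus > MODEL_MAX_CPUS)
		ncpus = MODEL_MAX_CPUS;

	pthread_mutex_init(&g.spawn_mutex, NULL);

	for (i = 0; i < ncpus; i++) {
		if (pthread_create(&tids[started], NULL,
				   model_cpu_bringup, &g) == 0)
			started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	pthread_mutex_destroy(&g.spawn_mutex);
	return g.spawn_count;
}
```

With the check-and-create step under the mutex, the shared thread is
created once no matter how many CPUs of the group race. Without it, the
model would only be safe as long as callers are serialized by something
else, which is what cpus_write_lock() provides today.
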

Thread overview: 5+ messages
2022-02-28  9:36 [PATCH] rcu/nocb: Clear rdp offloaded flags when rcuop/rcuog kthreads spawn failed Zqiang
2022-03-03 16:49 ` Frederic Weisbecker
2022-03-08  7:37   ` Zhang, Qiang1
2022-03-09 21:06     ` Frederic Weisbecker [this message]
2022-03-10  2:37       ` Zhang, Qiang1
