Re: 回复: [PATCH] rcu: shrink each possible cpu krcp

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Uladzislau Rezki <urezki@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Joel Fernandes <joel@joelfernandes.org>,
	Uladzislau Rezki <urezki@gmail.com>,
	"Zhang, Qiang" <Qiang.Zhang@windriver.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>, rcu <rcu@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: 回复: [PATCH] rcu: shrink each possible cpu krcp
Date: Mon, 31 Aug 2020 11:30:32 +0200	[thread overview]
Message-ID: <20200831093032.GA19139@pc636> (raw)
In-Reply-To: <20200821153328.GH2855@paulmck-ThinkPad-P72>

On Fri, Aug 21, 2020 at 08:33:28AM -0700, Paul E. McKenney wrote:
> On Thu, Aug 20, 2020 at 06:39:57PM -0400, Joel Fernandes wrote:
> > On Wed, Aug 19, 2020 at 05:58:08PM +0200, Uladzislau Rezki wrote:
> > > On Wed, Aug 19, 2020 at 08:21:59AM -0700, Paul E. McKenney wrote:
> > > > On Wed, Aug 19, 2020 at 09:56:54AM -0400, Joel Fernandes wrote:
> > > > > On Wed, Aug 19, 2020 at 03:00:55AM +0000, Zhang, Qiang wrote:
> > > > > > 
> > > > > > 
> > > > > > ________________________________________
> > > > > > 发件人: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> 代表 Joel Fernandes <joel@joelfernandes.org>
> > > > > > 发送时间: 2020年8月19日 8:04
> > > > > > 收件人: Paul E. McKenney
> > > > > > 抄送: Uladzislau Rezki; Zhang, Qiang; Josh Triplett; Steven Rostedt; Mathieu Desnoyers; Lai Jiangshan; rcu; LKML
> > > > > > 主题: Re: [PATCH] rcu: shrink each possible cpu krcp
> > > > > > 
> > > > > > On Tue, Aug 18, 2020 at 6:02 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > 
> > > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > > > index b8ccd7b5af82..6decb9ad2421 100644
> > > > > > > > --- a/kernel/rcu/tree.c
> > > > > > > > +++ b/kernel/rcu/tree.c
> > > > > > > > @@ -2336,10 +2336,15 @@ int rcutree_dead_cpu(unsigned int cpu)
> > > > > > > >  {
> > > > > > > >         struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> > > > > > > >         struct rcu_node *rnp = rdp->mynode;  /* Outgoing CPU's rdp & rnp. */
> > > > > > > > +       struct kfree_rcu_cpu *krcp;
> > > > > > > >
> > > > > > > >         if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> > > > > > > >                 return 0;
> > > > > > > >
> > > > > > > > +       /* Drain the kcrp of this CPU. IRQs should be disabled? */
> > > > > > > > +       krcp = this_cpu_ptr(&krc)
> > > > > > > > +       schedule_delayed_work(&krcp->monitor_work, 0);
> > > > > > > > +
> > > > > > > >
> > > > > > > > A cpu can be offlined and its krp will be stuck until a shrinker is involved.
> > > > > > > > Maybe be never.
> > > > > > >
> > > > > > > Does the same apply to its kmalloc() per-CPU caches?  If so, I have a
> > > > > > > hard time getting too worried about it.  ;-)
> > > > > > 
> > > > > > >Looking at slab_offline_cpu() , that calls cancel_delayed_work_sync()
> > > > > > >on the cache reaper who's job is to flush the per-cpu caches. So I
> > > > > > >believe during CPU offlining, the per-cpu slab caches are flushed.
> > > > > > >
> > > > > > >thanks,
> > > > > > >
> > > > > >  >- Joel
> > > > > > 
> > > > > > When cpu going offline, the slub or slab only flush free objects in offline
> > > > > > cpu cache,  put these free objects in node list  or return buddy system,
> > > > > > for those who are still in use, they still stay offline cpu cache.
> > > > > > 
> > > > > > If we want clean per-cpu "krcp" objects when cpu going offline.  we should
> > > > > > free "krcp" cache objects in "rcutree_offline_cpu", this func be called
> > > > > > before other rcu cpu offline func. and then "rcutree_offline_cpu" will be
> > > > > > called in "cpuhp/%u" per-cpu thread.
> > > > > > 
> > > > > 
> > > > > Could you please wrap text properly when you post to mailing list, thanks. I
> > > > > fixed it for you above.
> > > > > 
> > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > index 8ce77d9ac716..1812d4a1ac1b 100644
> > > > > > --- a/kernel/rcu/tree.c
> > > > > > +++ b/kernel/rcu/tree.c
> > > > > > @@ -3959,6 +3959,7 @@ int rcutree_offline_cpu(unsigned int cpu)
> > > > > >         unsigned long flags;
> > > > > >         struct rcu_data *rdp;
> > > > > >         struct rcu_node *rnp;
> > > > > > +       struct kfree_rcu_cpu *krcp;
> > > > > >  
> > > > > >         rdp = per_cpu_ptr(&rcu_data, cpu);
> > > > > >         rnp = rdp->mynode;
> > > > > > @@ -3970,6 +3971,11 @@ int rcutree_offline_cpu(unsigned int cpu)
> > > > > >  
> > > > > >         // nohz_full CPUs need the tick for stop-machine to work quickly
> > > > > >         tick_dep_set(TICK_DEP_BIT_RCU);
> > > > > > +
> > > > > > +       krcp = per_cpu_ptr(&krc, cpu);
> > > > > > +       raw_spin_lock_irqsave(&krcp->lock, flags);
> > > > > > +       schedule_delayed_work(&krcp->monitor_work, 0);
> > > > > > +       raw_spin_unlock_irqrestore(&krcp->lock, flags);
> > > > > >         return 0;
> > > > > 
> > > > > I realized the above is not good enough for what this is trying to do. Unlike
> > > > > the slab, the new kfree_rcu objects cannot always be drained / submitted to
> > > > > RCU because the previous batch may still be waiting for a grace period. So
> > > > > the above code could very well return with the yet-to-be-submitted kfree_rcu
> > > > > objects still in the cache.
> > > > > 
> > > > > One option is to spin-wait here for monitor_todo to be false and keep calling
> > > > > kfree_rcu_drain_unlock() till then.
> > > > > 
> > > > > But then that's not good enough either, because if new objects are queued
> > > > > when interrupts are enabled in the CPU offline path, then the cache will get
> > > > > new objects after the previous set was drained. Further, spin waiting may
> > > > > introduce deadlocks.
> > > > > 
> > > > > Another option is to switch the kfree_rcu() path to non-batching (so new
> > > > > objects cannot be cached in the offline path and are submitted directly to
> > > > > RCU), wait for a GP and then submit the work. But then not sure if 1-argument
> > > > > kfree_rcu() will like that.
> > > > 
> > > > Or spawn a workqueue that does something like this:
> > > > 
> > > > 1.	Get any pending kvfree_rcu() requests sent off to RCU.
> > > > 
> > > > 2.	Do an rcu_barrier().
> > > > 
> > > > 3.	Do the cleanup actions.
> > > > 
> > > > > Probably Qian's original fix for for_each_possible_cpus() is good enough for
> > > > > the shrinker case, and then we can tackle the hotplug one.
> > > > 
> > > > It might take some experimentation to find the best solution.
> > > > 
> > > 
> > > <snip>
> > > static void do_idle(void)
> > > {
> > > ...
> > >  while (!need_resched()) {
> > >   rmb();
> > > 
> > >   local_irq_disable();
> > > 
> > >   if (cpu_is_offline(cpu)) {
> > >    tick_nohz_idle_stop_tick();
> > >    cpuhp_report_idle_dead();
> > >        -> cpuhp_report_idle_dead(void)
> > >               -> rcu_report_dead(smp_processor_id());
> > >    arch_cpu_idle_dead();
> > >   }
> > > ...
> > > <snip>
> > > 
> > > We have the rcu_report_dead() callback. When it gets called IRQs are off
> > > and CPU that is in question is offline.
> > > 
> > >     krcp = per_cpu_ptr(&krc, cpu);
> > >     raw_spin_lock_irqsave(&krcp->lock, flags);
> > >     krcp->monotro_todo = true;
> > >     schedule_delayed_work(&krcp->monitor_work, 0);
> > >     raw_spin_unlock_irqrestore(&krcp->lock, flags);
> > > 
> > > If there is a batch that is in progress, the job will rearm itself.
> > > But i agree, it requires more experiments.
> > 
> > I chatted with Ulad and we believe the timer and/or (delayed) workqueue will
> > get migrated during the CPU offline path, so it is not an issue.
> > 
> > In this case, Qiang's initial patch suffices to fix the shrinker issue.
> 
> As in the patch that is currented in -rcu, correct?
> 
The "[PATCH] rcu: shrink each possible cpu krcp" will fix the
shrinker "issue" when CPU goes offline. So that was valid concern.

As for "main track", our drain work or a timer that triggers it
will be migrated anyway. Just to repeat what Joel wrote earlier,
we do not have any issues with this part.

--
Vlad Rezki

next prev parent reply	other threads:[~2020-08-31  9:30 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-14  6:45 [PATCH] rcu: shrink each possible cpu krcp qiang.zhang
2020-08-14 18:51 ` Uladzislau Rezki
2020-08-17 22:03   ` Joel Fernandes
2020-08-18 17:18     ` Paul E. McKenney
2020-08-18 19:00       ` Joel Fernandes
2020-08-18 21:03         ` Paul E. McKenney
2020-08-18 21:55           ` Uladzislau Rezki
2020-08-18 22:02             ` Paul E. McKenney
2020-08-19  0:04               ` Joel Fernandes
2020-08-19  3:00                 ` 回复: " Zhang, Qiang
2020-08-19 13:04                   ` Paul E. McKenney
2020-08-19 13:56                   ` Joel Fernandes
2020-08-19 15:21                     ` Paul E. McKenney
2020-08-19 15:54                       ` Joel Fernandes
2020-08-19 15:58                       ` Uladzislau Rezki
2020-08-20 22:39                         ` Joel Fernandes
2020-08-21 15:33                           ` Paul E. McKenney
2020-08-31  9:30                             ` Uladzislau Rezki [this message]
2020-09-09  6:35                             ` Zhang, Qiang
2020-09-09  7:03                               ` RCU: Question rcu_preempt_blocked_readers_cgp in rcu_gp_fqs_loop func Zhang, Qiang
2020-09-09 11:22                                 ` Paul E. McKenney
2020-09-10  3:25                                   ` 回复: " Zhang, Qiang
2020-09-14 20:06                                   ` Joel Fernandes
2020-08-19 11:22                 ` [PATCH] rcu: shrink each possible cpu krcp Uladzislau Rezki
2020-08-19 13:25                   ` Joel Fernandes
2020-08-18 23:25             ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200831093032.GA19139@pc636 \
    --to=urezki@gmail.com \
    --cc=Qiang.Zhang@windriver.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.