From: Pingfan Liu <kernelfans@gmail.com>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: rcu@vger.kernel.org, "Paul E. McKenney" <paulmck@kernel.org>,
David Woodhouse <dwmw@amazon.co.uk>,
Neeraj Upadhyay <quic_neeraju@quicinc.com>,
Josh Triplett <josh@joshtriplett.org>,
Steven Rostedt <rostedt@goodmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Joel Fernandes <joel@joelfernandes.org>,
"Jason A. Donenfeld" <Jason@zx2c4.com>
Subject: Re: [PATCHv2 3/3] rcu: coordinate tick dependency during concurrent offlining
Date: Tue, 20 Sep 2022 15:26:28 +0800 [thread overview]
Message-ID: <YylrJCjYPrcT6xKJ@piliu.users.ipa.redhat.com> (raw)
In-Reply-To: <20220916134258.GB25891@lothringen>
On Fri, Sep 16, 2022 at 03:42:58PM +0200, Frederic Weisbecker wrote:
> On Thu, Sep 15, 2022 at 01:58:25PM +0800, Pingfan Liu wrote:
> > As Paul pointed out "The tick_dep_clear() is SMP-safe because it uses
> > atomic operations, but the problem is that if there are multiple
> > nohz_full CPUs going offline concurrently, the first CPU to invoke
> > rcutree_dead_cpu() will turn the tick off. This might require an
> > atomically manipulated counter to mediate the calls to
> > rcutree_dead_cpu(). "
> >
> > This patch introduces a new member ->dying to rcu_node, which reflects
> > the number of concurrent offlining cpu. TICK_DEP_BIT_RCU is set by
> > the first entrance and cleared by the last.
> >
> > Note: now, tick_dep_set() is put under the rnp->lock, but since it takes
> > no lock, no extra locking order is introduced.
> >
> > Suggested-by: "Paul E. McKenney" <paulmck@kernel.org>
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > Cc: Josh Triplett <josh@joshtriplett.org>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> > To: rcu@vger.kernel.org
> > ---
> > kernel/rcu/tree.c | 19 ++++++++++++++-----
> > kernel/rcu/tree.h | 1 +
> > 2 files changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 8a829b64f5b2..f8bd0fc5fd2f 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2164,13 +2164,19 @@ int rcutree_dead_cpu(unsigned int cpu)
> > {
> > struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> > struct rcu_node *rnp = rdp->mynode; /* Outgoing CPU's rdp & rnp. */
> > + unsigned long flags;
> > + u8 dying;
> >
> > if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> > return 0;
> >
> > WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> > - // Stop-machine done, so allow nohz_full to disable tick.
> > - tick_dep_clear(TICK_DEP_BIT_RCU);
> > + raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > + dying = --rnp->dying;
> > + if (!dying)
> > + // Stop-machine done, so allow nohz_full to disable tick.
> > + tick_dep_clear(TICK_DEP_BIT_RCU);
> > + raw_spin_lock_irqsave_rcu_node(rnp, flags);
>
> Note this is only locking the rdp's node, not the root node.
> Therefore if CPU 0 and CPU 256 are going off at the same time and they
> don't belong to the same node, the above won't protect against concurrent
> TICK_DEP_BIT_RCU set/clear.
>
Nice, thanks for the careful thoughts. How about moving the counting
place to the root node?
> My suspicion is that we don't need this TICK_DEP_BIT_RCU tick dependency
> anymore. I believe it was there because of issues that were fixed with:
>
> 53e87e3cdc15 (timers/nohz: Last resort update jiffies on nohz_full IRQ entry)
> and:
>
> a1ff03cd6fb9 (tick: Detect and fix jiffies update stall)
>
> It's unfortunately just suspicion because the reason for that tick dependency
> is unclear but I believe it should be safe to remove now.
>
I have gone through this tick dependency again, but got less.
I think at least from the RCU's viewpoint, it is useless since
multi_cpu_stop()->rcu_momentary_dyntick_idle() has eliminate the
requirement for tick interrupt.
Is there a way to have a convincing test so that these code can be removed?
Or this code will be got along with?
Thanks,
Pingfan
next prev parent reply other threads:[~2022-09-20 7:26 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-15 5:58 [PATCHv2 0/3] rcu: Enhance the capability to cope with concurrent cpu offlining/onlining Pingfan Liu
2022-09-15 5:58 ` [PATCHv2 1/3] rcu: Keep qsmaskinitnext fresh when rcutree_online_cpu() Pingfan Liu
2022-09-15 6:11 ` Pingfan Liu
2022-09-16 14:52 ` Frederic Weisbecker
2022-09-19 10:24 ` Pingfan Liu
2022-09-19 10:51 ` Frederic Weisbecker
2022-09-20 3:45 ` Pingfan Liu
2022-09-20 9:23 ` Frederic Weisbecker
2022-10-01 2:26 ` Joel Fernandes
2022-10-02 12:34 ` Pingfan Liu
2022-10-02 15:52 ` Joel Fernandes
2022-09-20 10:31 ` Frederic Weisbecker
2022-09-21 11:56 ` Pingfan Liu
2022-09-15 5:58 ` [PATCHv2 2/3] rcu: Resort to cpu_dying_mask for affinity when offlining Pingfan Liu
2022-09-16 14:23 ` Frederic Weisbecker
2022-09-19 4:33 ` Pingfan Liu
2022-09-19 10:34 ` Frederic Weisbecker
2022-09-20 3:16 ` Pingfan Liu
2022-09-20 9:00 ` Frederic Weisbecker
2022-09-20 9:38 ` Frederic Weisbecker
2022-09-21 11:48 ` Pingfan Liu
2022-09-15 5:58 ` [PATCHv2 3/3] rcu: coordinate tick dependency during concurrent offlining Pingfan Liu
2022-09-16 13:42 ` Frederic Weisbecker
2022-09-20 7:26 ` Pingfan Liu [this message]
2022-09-20 9:46 ` Frederic Weisbecker
2022-09-20 19:13 ` Paul E. McKenney
2022-09-22 9:29 ` Pingfan Liu
2022-09-22 13:54 ` Paul E. McKenney
2022-09-23 22:13 ` Frederic Weisbecker
2022-09-26 6:34 ` Pingfan Liu
2022-09-26 22:23 ` Paul E. McKenney
2022-09-27 9:59 ` Pingfan Liu
2022-09-29 8:19 ` Pingfan Liu
2022-09-29 8:20 ` Pingfan Liu
2022-09-30 13:04 ` Joel Fernandes
2022-10-02 14:06 ` Pingfan Liu
2022-10-02 16:11 ` Joel Fernandes
2022-10-02 16:24 ` Paul E. McKenney
2022-10-02 16:30 ` Joel Fernandes
2022-10-02 16:57 ` Paul E. McKenney
2022-10-02 16:59 ` Joel Fernandes
2022-09-30 15:44 ` Paul E. McKenney
2022-10-02 13:29 ` Pingfan Liu
2022-10-02 15:08 ` Frederic Weisbecker
2022-10-02 16:20 ` Paul E. McKenney
2022-10-02 16:20 ` Paul E. McKenney
[not found] ` <CAFgQCTtgLfc0NeYqyWk4Ew-pA9rMREjRjWSnQhYLv-V5117s9Q@mail.gmail.com>
2022-10-27 17:46 ` Paul E. McKenney
2022-10-31 3:24 ` Pingfan Liu
2022-11-03 16:51 ` Paul E. McKenney
2022-11-07 16:07 ` Paul E. McKenney
2022-11-09 18:55 ` Joel Fernandes
2022-11-18 12:08 ` Pingfan Liu
2022-11-18 23:30 ` Paul E. McKenney
2022-11-21 3:48 ` Pingfan Liu
2022-11-21 17:14 ` Paul E. McKenney
2022-11-17 14:39 ` Frederic Weisbecker
2022-11-18 1:45 ` Pingfan Liu
[not found] ` <CAFgQCTtNetv7v_Law=abPtngC8Gv6OGcGz9M_wWMxz_GAEWDUQ@mail.gmail.com>
2022-10-27 18:13 ` Paul E. McKenney
2022-10-31 2:10 ` Pingfan Liu
2022-09-26 16:13 ` Joel Fernandes
2022-09-27 9:42 ` Pingfan Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YylrJCjYPrcT6xKJ@piliu.users.ipa.redhat.com \
--to=kernelfans@gmail.com \
--cc=Jason@zx2c4.com \
--cc=dwmw@amazon.co.uk \
--cc=frederic@kernel.org \
--cc=jiangshanlai@gmail.com \
--cc=joel@joelfernandes.org \
--cc=josh@joshtriplett.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=paulmck@kernel.org \
--cc=quic_neeraju@quicinc.com \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox