From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-kernel@vger.kernel.org, Boqun Feng <boqun.feng@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
tglx@linutronix.de, Steven Rostedt <rostedt@goodmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Lai Jiangshan <jiangshanlai@gmail.com>
Subject: Re: [PATCH] rcu: Use cpus_read_lock() while looking at cpu_online_mask
Date: Tue, 11 Sep 2018 09:05:32 -0700 [thread overview]
Message-ID: <20180911160532.GJ4225@linux.vnet.ibm.com> (raw)
In-Reply-To: <20180910135615.tr3cvipwbhq6xug4@linutronix.de>
On Mon, Sep 10, 2018 at 03:56:16PM +0200, Sebastian Andrzej Siewior wrote:
> It was possible that sync_rcu_exp_select_cpus() enqueued something on
> CPU0 while CPU0 was offline. Such a work item wouldn't be processed
> until CPU0 gets back online. This problem was addressed in commit
> fcc6354365015 ("rcu: Make expedited GPs handle CPU 0 being offline"). I
> don't think the issue fully addressed.
>
> Assume grplo = 0 and grphi = 7 and sync_rcu_exp_select_cpus() is invoked
> on CPU1. The preempt_disable() section on CPU1 won't ensure that CPU0
> remains online between looking at cpu_online_mask and invoking
> queue_work_on() on CPU1.
>
> Use cpus_read_lock() to ensure that `cpu' is not going down between
> looking at cpu_online_mask at invoking queue_work_on() and waiting for
> its completion. It is added around the loop + flush_work() which is
> similar to work_on_cpu_safe() (and we can have multiple jobs running on
> NUMA systems).
Is this experimental or theoretical? If theoretical, the counter-theory
is that the stop-machine processing prevents any of the cpu_online_mask
bits from changing, though, yes, we would like to get rid of the
stop-machine processing. So either way, yes, the current state could
use some improvement.
But one problem with the patch below is that sync_rcu_exp_select_cpus()
can be called while the cpu_hotplug_lock is write-held. Or is that
somehow OK these days? Assuming not, how about the (untested) patch
below?
Thanx, Paul
> Fixes: fcc6354365015 ("rcu: Make expedited GPs handle CPU 0 being
> offline")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> kernel/rcu/tree_exp.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index 01b6ddeb4f050..a104cf91e6b90 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -479,6 +479,7 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
> sync_exp_reset_tree(rsp);
> trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("select"));
>
> + cpus_read_lock();
> /* Schedule work for each leaf rcu_node structure. */
> rcu_for_each_leaf_node(rsp, rnp) {
> rnp->exp_need_flush = false;
> @@ -493,13 +494,11 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
> continue;
> }
> INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
> - preempt_disable();
> cpu = cpumask_next(rnp->grplo - 1, cpu_online_mask);
> /* If all offline, queue the work on an unbound CPU. */
> if (unlikely(cpu > rnp->grphi))
> cpu = WORK_CPU_UNBOUND;
> queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
> - preempt_enable();
> rnp->exp_need_flush = true;
> }
>
> @@ -507,6 +506,7 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
> rcu_for_each_leaf_node(rsp, rnp)
> if (rnp->exp_need_flush)
> flush_work(&rnp->rew.rew_work);
> + cpus_read_unlock();
> }
>
> static void synchronize_sched_expedited_wait(struct rcu_state *rsp)
commit 5214cbbfe6a5d6b92c76c4e411a049fe57245d4a
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue Sep 11 08:57:48 2018 -0700
rcu: Stop expedited grace periods from relying on stop-machine
The CPU-selection code in sync_rcu_exp_select_cpus() disables preemption
to prevent the cpu_online_mask from changing. However, this relies on
the stop-machine mechanism in the CPU-hotplug offline code, which is not
desirable (it would be good to someday remove the stop-machine mechanism).
This commit therefore instead uses the relevant leaf rcu_node structure's
->ffmask, which has a bit set for all CPUs that are fully functional.
A given CPU's bit is cleared very early during offline processing by
rcutree_offline_cpu() and set very late during online processsing by
rcutree_online_cpu(). Therefore, if a CPU's bit is set in this mask, and
preemption is disabled, we have to be before the synchronize_sched() in
the CPU-hotplug offline code, which means that the CPU is guaranteed to be
workqueue-ready throughout the duration of the enclosing preempt_disable()
region of code.
This also has the side-effect of using WORK_CPU_UNBOUND if all the CPUs for
this leaf rcu_node structure are offline, which is an acceptable difference
in behavior.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 8d18c1014e2b..e669ccf3751b 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -450,10 +450,12 @@ static void sync_rcu_exp_select_cpus(smp_call_func_t func)
}
INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
preempt_disable();
- cpu = cpumask_next(rnp->grplo - 1, cpu_online_mask);
+ cpu = find_next_bit(&rnp->ffmask, BITS_PER_LONG, -1);
/* If all offline, queue the work on an unbound CPU. */
- if (unlikely(cpu > rnp->grphi))
+ if (unlikely(cpu > rnp->grphi - rnp->grplo))
cpu = WORK_CPU_UNBOUND;
+ else
+ cpu += rnp->grplo;
queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
preempt_enable();
rnp->exp_need_flush = true;
next prev parent reply other threads:[~2018-09-11 16:05 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-10 13:56 [PATCH] rcu: Use cpus_read_lock() while looking at cpu_online_mask Sebastian Andrzej Siewior
2018-09-11 16:05 ` Paul E. McKenney [this message]
2018-09-11 16:21 ` Sebastian Andrzej Siewior
2018-09-11 17:02 ` Paul E. McKenney
2018-09-19 20:55 ` Tejun Heo
2018-09-19 22:11 ` Paul E. McKenney
2018-10-12 18:41 ` Sebastian Andrzej Siewior
2018-10-13 13:48 ` Paul E. McKenney
2018-10-15 14:42 ` Sebastian Andrzej Siewior
2018-10-15 15:07 ` Boqun Feng
2018-10-15 15:09 ` Sebastian Andrzej Siewior
2018-10-15 15:33 ` Boqun Feng
2018-10-15 16:36 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180911160532.GJ4225@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=bigeasy@linutronix.de \
--cc=boqun.feng@gmail.com \
--cc=jiangshanlai@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).