From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-kernel@vger.kernel.org, Boqun Feng <boqun.feng@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
tglx@linutronix.de, Steven Rostedt <rostedt@goodmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Lai Jiangshan <jiangshanlai@gmail.com>
Subject: Re: [PATCH] rcu: Use cpus_read_lock() while looking at cpu_online_mask
Date: Tue, 11 Sep 2018 09:05:32 -0700 [thread overview]
Message-ID: <20180911160532.GJ4225@linux.vnet.ibm.com> (raw)
In-Reply-To: <20180910135615.tr3cvipwbhq6xug4@linutronix.de>
On Mon, Sep 10, 2018 at 03:56:16PM +0200, Sebastian Andrzej Siewior wrote:
> It was possible that sync_rcu_exp_select_cpus() enqueued something on
> CPU0 while CPU0 was offline. Such a work item wouldn't be processed
> until CPU0 gets back online. This problem was addressed in commit
> fcc6354365015 ("rcu: Make expedited GPs handle CPU 0 being offline"). I
> don't think the issue fully addressed.
>
> Assume grplo = 0 and grphi = 7 and sync_rcu_exp_select_cpus() is invoked
> on CPU1. The preempt_disable() section on CPU1 won't ensure that CPU0
> remains online between looking at cpu_online_mask and invoking
> queue_work_on() on CPU1.
>
> Use cpus_read_lock() to ensure that `cpu' is not going down between
> looking at cpu_online_mask at invoking queue_work_on() and waiting for
> its completion. It is added around the loop + flush_work() which is
> similar to work_on_cpu_safe() (and we can have multiple jobs running on
> NUMA systems).
Is this experimental or theoretical? If theoretical, the counter-theory
is that the stop-machine processing prevents any of the cpu_online_mask
bits from changing, though, yes, we would like to get rid of the
stop-machine processing. So either way, yes, the current state could
use some improvement.
But one problem with the patch below is that sync_rcu_exp_select_cpus()
can be called while the cpu_hotplug_lock is write-held. Or is that
somehow OK these days? Assuming not, how about the (untested) patch
below?
Thanx, Paul
> Fixes: fcc6354365015 ("rcu: Make expedited GPs handle CPU 0 being
> offline")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> kernel/rcu/tree_exp.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index 01b6ddeb4f050..a104cf91e6b90 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -479,6 +479,7 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
> sync_exp_reset_tree(rsp);
> trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("select"));
>
> + cpus_read_lock();
> /* Schedule work for each leaf rcu_node structure. */
> rcu_for_each_leaf_node(rsp, rnp) {
> rnp->exp_need_flush = false;
> @@ -493,13 +494,11 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
> continue;
> }
> INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
> - preempt_disable();
> cpu = cpumask_next(rnp->grplo - 1, cpu_online_mask);
> /* If all offline, queue the work on an unbound CPU. */
> if (unlikely(cpu > rnp->grphi))
> cpu = WORK_CPU_UNBOUND;
> queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
> - preempt_enable();
> rnp->exp_need_flush = true;
> }
>
> @@ -507,6 +506,7 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
> rcu_for_each_leaf_node(rsp, rnp)
> if (rnp->exp_need_flush)
> flush_work(&rnp->rew.rew_work);
> + cpus_read_unlock();
> }
>
> static void synchronize_sched_expedited_wait(struct rcu_state *rsp)
commit 5214cbbfe6a5d6b92c76c4e411a049fe57245d4a
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date: Tue Sep 11 08:57:48 2018 -0700
rcu: Stop expedited grace periods from relying on stop-machine
The CPU-selection code in sync_rcu_exp_select_cpus() disables preemption
to prevent the cpu_online_mask from changing. However, this relies on
the stop-machine mechanism in the CPU-hotplug offline code, which is not
desirable (it would be good to someday remove the stop-machine mechanism).
This commit therefore instead uses the relevant leaf rcu_node structure's
->ffmask, which has a bit set for all CPUs that are fully functional.
A given CPU's bit is cleared very early during offline processing by
rcutree_offline_cpu() and set very late during online processsing by
rcutree_online_cpu(). Therefore, if a CPU's bit is set in this mask, and
preemption is disabled, we have to be before the synchronize_sched() in
the CPU-hotplug offline code, which means that the CPU is guaranteed to be
workqueue-ready throughout the duration of the enclosing preempt_disable()
region of code.
This also has the side-effect of using WORK_CPU_UNBOUND if all the CPUs for
this leaf rcu_node structure are offline, which is an acceptable difference
in behavior.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 8d18c1014e2b..e669ccf3751b 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -450,10 +450,12 @@ static void sync_rcu_exp_select_cpus(smp_call_func_t func)
}
INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
preempt_disable();
- cpu = cpumask_next(rnp->grplo - 1, cpu_online_mask);
+ cpu = find_next_bit(&rnp->ffmask, BITS_PER_LONG, -1);
/* If all offline, queue the work on an unbound CPU. */
- if (unlikely(cpu > rnp->grphi))
+ if (unlikely(cpu > rnp->grphi - rnp->grplo))
cpu = WORK_CPU_UNBOUND;
+ else
+ cpu += rnp->grplo;
queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
preempt_enable();
rnp->exp_need_flush = true;
next prev parent reply other threads:[~2018-09-11 16:05 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-10 13:56 [PATCH] rcu: Use cpus_read_lock() while looking at cpu_online_mask Sebastian Andrzej Siewior
2018-09-11 16:05 ` Paul E. McKenney [this message]
2018-09-11 16:21 ` Sebastian Andrzej Siewior
2018-09-11 17:02 ` Paul E. McKenney
2018-09-19 20:55 ` Tejun Heo
2018-09-19 22:11 ` Paul E. McKenney
2018-10-12 18:41 ` Sebastian Andrzej Siewior
2018-10-13 13:48 ` Paul E. McKenney
2018-10-15 14:42 ` Sebastian Andrzej Siewior
2018-10-15 15:07 ` Boqun Feng
2018-10-15 15:09 ` Sebastian Andrzej Siewior
2018-10-15 15:33 ` Boqun Feng
2018-10-15 16:36 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180911160532.GJ4225@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=bigeasy@linutronix.de \
--cc=boqun.feng@gmail.com \
--cc=jiangshanlai@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.