From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>,
Matt Fleming <matt@codeblueprint.co.uk>,
Ingo Molnar <mingo@kernel.org>,
linux-kernel@vger.kernel.org, Michal Hocko <mhocko@suse.com>
Subject: Re: cpu stopper threads and load balancing leads to deadlock
Date: Thu, 3 May 2018 10:18:50 -0700
Message-ID: <20180503171850.GL26088@linux.vnet.ibm.com>
In-Reply-To: <20180503164508.GG12217@hirez.programming.kicks-ass.net>

On Thu, May 03, 2018 at 06:45:08PM +0200, Peter Zijlstra wrote:
> On Thu, May 03, 2018 at 09:12:31AM -0700, Paul E. McKenney wrote:
> > On Thu, May 03, 2018 at 04:44:50PM +0200, Peter Zijlstra wrote:
> > > On Thu, May 03, 2018 at 04:16:55PM +0200, Mike Galbraith wrote:
> > > > On Thu, 2018-05-03 at 15:56 +0200, Peter Zijlstra wrote:
> > > > > On Thu, May 03, 2018 at 03:32:39PM +0200, Mike Galbraith wrote:
> > > > >
> > > > > > Dang. With $subject fix applied as well..
> > > > >
> > > > > That's a NO then... :-(
> > > >
> > > > Could say who cares about oddball offline wakeup stat. <cringe>
> > >
> > > Yeah, nobody.. but I don't want to have to change the wakeup code to
> > > deal with this if at all possible. That'd just add conditions that are
> > > 'always' false, except in this exceedingly rare circumstance.
> > >
> > > So ideally we manage to tell RCU that it needs to pay attention while
> > > we're doing this here thing, which is what I thought RCU_NONIDLE() was
> > > about.
> >
> > One straightforward approach would be to provide an arch-specific
> > Kconfig option that tells notify_cpu_starting() not to bother invoking
> > rcu_cpu_starting(). Then x86 selects this Kconfig option and invokes
> > rcu_cpu_starting() itself early enough to avoid splats.
> >
> > See the (untested, probably does not even build) patch below.
> >
> > I have no idea where to insert either the "select" or the call to
> > rcu_cpu_starting(), so I left those out. I know that putting the
> > call too early will cause trouble, but I have no idea what constitutes
> > "too early". :-/
>
> Something like so perhaps? Mike, can you play around with that? Could
> burn your granny and eat your cookies.
>
>
> diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
> index 7468de429087..07360523c3ce 100644
> --- a/arch/x86/kernel/cpu/mtrr/main.c
> +++ b/arch/x86/kernel/cpu/mtrr/main.c
> @@ -793,6 +793,9 @@ void mtrr_ap_init(void)
>
> if (!use_intel() || mtrr_aps_delayed_init)
> return;
> +
> + rcu_cpu_starting(smp_processor_id());
> +
> /*
> * Ideally we should hold mtrr_mutex here to avoid mtrr entries
> * changed, but this routine will be called in cpu boot time,
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 2a734692a581..4dab46950fdb 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3775,6 +3775,8 @@ int rcutree_dead_cpu(unsigned int cpu)
> return 0;
> }
>
> +static DEFINE_PER_CPU(int, rcu_cpu_started);
> +
> /*
> * Mark the specified CPU as being online so that subsequent grace periods
> * (both expedited and normal) will wait on it. Note that this means that
> @@ -3796,6 +3798,11 @@ void rcu_cpu_starting(unsigned int cpu)
> struct rcu_node *rnp;
> struct rcu_state *rsp;
>
> + if (per_cpu(rcu_cpu_started, cpu))

I would log a non-splat dmesg the first time this happened, just for my
future sanity, but otherwise looks fine. I am a bit concerned about
calls to rcu_cpu_starting() getting sprinkled all through the code.
Or am I being excessively paranoid?

							Thanx, Paul

> + return;
> +
> + per_cpu(rcu_cpu_started, cpu) = 1;
> +
> for_each_rcu_flavor(rsp) {
> rdp = per_cpu_ptr(rsp->rda, cpu);
> rnp = rdp->mynode;
> @@ -3852,6 +3859,8 @@ void rcu_report_dead(unsigned int cpu)
> preempt_enable();
> for_each_rcu_flavor(rsp)
> rcu_cleanup_dying_idle_cpu(cpu, rsp);
> +
> + per_cpu(rcu_cpu_started, cpu) = 0;
> }
>
> /* Migrate the dead CPU's callbacks to the current CPU. */
>
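
The guard in the quoted patch, combined with the "log it the first time"
suggestion above, can be modeled in user space. This is only a rough
sketch under stated assumptions, not the kernel change itself: every
sketch_* name and SKETCH_NR_CPUS is hypothetical, a plain array stands in
for the per-CPU rcu_cpu_started variable, and fprintf() stands in for
whatever non-splat message the kernel would emit (possibly something like
pr_info_once(), which is an assumption and not part of the patch).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_NR_CPUS 8	/* hypothetical CPU count for this model */

/* Stands in for the per-CPU rcu_cpu_started flag from the quoted patch. */
static int sketch_cpu_started[SKETCH_NR_CPUS];
/* Set once the first repeated call has been reported. */
static bool sketch_warned;
/* Counts how many times the "mark this CPU online" work actually ran. */
static int sketch_online_marks;

/*
 * Model of the guarded rcu_cpu_starting(): the first call for a CPU does
 * the work, later calls return early.  The first early return (on any CPU)
 * prints a single non-fatal message.  Returns true if work was done.
 */
static bool sketch_cpu_starting(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);

	if (sketch_cpu_started[cpu]) {
		if (!sketch_warned) {
			sketch_warned = true;
			fprintf(stderr,
				"sketch: repeated starting call on CPU %u\n",
				cpu);
		}
		return false;
	}

	sketch_cpu_started[cpu] = 1;
	sketch_online_marks++;	/* placeholder for the per-flavor work */
	return true;
}

/* Model of the rcu_report_dead() hunk: clear the flag so the CPU can restart. */
static void sketch_report_dead(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	sketch_cpu_started[cpu] = 0;
}
```

With this shape, an early arch-specific call followed by the generic
notify_cpu_starting() path would do the work once and log once, and
clearing the flag on the dead path lets the next online cycle start over.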