From: Joel Fernandes <joel@joelfernandes.org>
To: "Li, Aubrey" <aubrey.li@linux.intel.com>
Cc: paulmck@kernel.org, linux-kernel@vger.kernel.org,
	vpillai <vpillai@digitalocean.com>,
	Aaron Lu <aaron.lwe@gmail.com>,
	Aubrey Li <aubrey.intel@gmail.com>,
	peterz@infradead.org, Ben Segall <bsegall@google.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Steven Rostedt <rostedt@goodmis.org>,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
Date: Mon, 23 Mar 2020 11:21:26 -0400
Message-ID: <20200323152126.GA141027@google.com>
In-Reply-To: <f77b9432-933c-a9fe-5541-437cf0094a65@linux.intel.com>

On Mon, Mar 23, 2020 at 02:58:18PM +0800, Li, Aubrey wrote:
> On 2020/3/14 8:30, Paul E. McKenney wrote:
> > On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
> >> rcu_read_unlock() can incur an infrequent deadlock in
> >> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
> >>
> >> This fixes the following spinlock recursion observed when testing the
> >> core scheduling patches on PREEMPT=y kernel on ChromeOS:
> >>
> >> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
> >>
> > 
> > The original could indeed deadlock, and this would avoid that deadlock.
> > (The commit to solve this deadlock is sadly not yet in mainline.)
> > 
> > Acked-by: Paul E. McKenney <paulmck@kernel.org>
> 
> I saw this in dmesg with this patch, is it expected?
> 
> [  117.000905] =============================
> [  117.000907] WARNING: suspicious RCU usage
> [  117.000911] 5.5.7+ #160 Not tainted
> [  117.000913] -----------------------------
> [  117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
> [  117.000918] 
>                other info that might help us debug this:

Sigh, this is because for_each_domain() expects rcu_read_lock() to be held.
From an RCU PoV, the code is correct (the warning does not indicate an actual
problem).

To silence the warning, we could replace the rcu_read_lock_sched() in my patch
with:

preempt_disable();
rcu_read_lock();

and replace the unlock with:

rcu_read_unlock();
preempt_enable();

That should take care of both the warning and the scheduler-related
deadlock. Thoughts?
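
To make the difference between the two variants concrete, here is a minimal
userspace sketch. It is a toy model, not kernel code: plain counters stand in
for the real primitives, and domain_walk_check(), lockdep_warnings and the two
balance_with_*() functions are hypothetical names invented for illustration.
It assumes only what is stated above, namely that the check behind
for_each_domain() wants rcu_read_lock() held and is not satisfied by
rcu_read_lock_sched(). It models the warning only, not the deadlock.

```c
/* Userspace toy model, NOT kernel code: counters stand in for the real primitives. */
#include <assert.h>

static int preempt_count;    /* models the preempt-disable nesting depth */
static int rcu_nesting;      /* models rcu_read_lock() nesting as the check sees it */
static int lockdep_warnings; /* counts "suspicious RCU usage" reports */

static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { assert(preempt_count > 0); preempt_count--; }
static void rcu_read_lock(void)   { rcu_nesting++; }
static void rcu_read_unlock(void) { assert(rcu_nesting > 0); rcu_nesting--; }

/* Modeled as disabling preemption without registering as rcu_read_lock(). */
static void rcu_read_lock_sched(void)   { preempt_disable(); }
static void rcu_read_unlock_sched(void) { preempt_enable(); }

/* Hypothetical stand-in for the check behind for_each_domain(). */
static void domain_walk_check(void)
{
	if (rcu_nesting == 0)
		lockdep_warnings++;
}

/* Variant from the posted patch: the check fires. Returns the warning count. */
int balance_with_rcu_sched(void)
{
	lockdep_warnings = 0;
	rcu_read_lock_sched();
	domain_walk_check();
	rcu_read_unlock_sched();
	return lockdep_warnings;
}

/* Variant proposed in this mail: the check is satisfied. Returns the warning count. */
int balance_with_preempt_and_rcu(void)
{
	lockdep_warnings = 0;
	preempt_disable();
	rcu_read_lock();
	domain_walk_check();
	rcu_read_unlock();
	preempt_enable();
	return lockdep_warnings;
}
```

In this model the first variant reports one warning and the second reports
none, while both keep preemption disabled across the walk and leave the
counters balanced on exit.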

Does that fix the warning for you? 

thanks,

 - Joel

> 
> [  117.000921] 
>                rcu_scheduler_active = 2, debug_locks = 1
> [  117.000923] 1 lock held by swapper/52/0:
> [  117.000925]  #0: ffffffff82670960 (rcu_read_lock_sched){....}, at: sched_core_balance+0x5/0x700
> [  117.000937] 
>                stack backtrace:
> [  117.000940] CPU: 52 PID: 0 Comm: swapper/52 Kdump: loaded Not tainted 5.5.7+ #160
> [  117.000943] Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0412.020920172159 02/09/2017
> [  117.000945] Call Trace:
> [  117.000955]  dump_stack+0x86/0xcb
> [  117.000962]  sched_core_balance+0x634/0x700
> [  117.000982]  __balance_callback+0x49/0xa0
> [  117.000990]  __schedule+0x1416/0x1620
> [  117.001000]  ? lockdep_hardirqs_off+0xa0/0xe0
> [  117.001005]  ? _raw_spin_unlock_irqrestore+0x41/0x70
> [  117.001024]  schedule_idle+0x28/0x40
> [  117.001030]  do_idle+0x17e/0x2a0
> [  117.001041]  cpu_startup_entry+0x19/0x20
> [  117.001048]  start_secondary+0x16c/0x1c0
> [  117.001055]  secondary_startup_64+0xa4/0xb0
> 
> > 
> >> ---
> >>  kernel/sched/core.c | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index 3045bd50e249..037e8f2e2686 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -4735,7 +4735,7 @@ static void sched_core_balance(struct rq *rq)
> >>  	struct sched_domain *sd;
> >>  	int cpu = cpu_of(rq);
> >>  
> >> -	rcu_read_lock();
> >> +	rcu_read_lock_sched();
> >>  	raw_spin_unlock_irq(rq_lockp(rq));
> >>  	for_each_domain(cpu, sd) {
> >>  		if (!(sd->flags & SD_LOAD_BALANCE))
> >> @@ -4748,7 +4748,7 @@ static void sched_core_balance(struct rq *rq)
> >>  			break;
> >>  	}
> >>  	raw_spin_lock_irq(rq_lockp(rq));
> >> -	rcu_read_unlock();
> >> +	rcu_read_unlock_sched();
> >>  }
> >>  
> >>  static DEFINE_PER_CPU(struct callback_head, core_balance_head);
> >> -- 
> >> 2.25.1.481.gfbce0eb801-goog
> >>
> 
