From: Sasha Levin <sasha.levin@oracle.com>
To: Davidlohr Bueso <dave@stgolabs.net>, Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Dave Jones <davej@codemonkey.org.uk>,
	jason.low2@hp.com, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: sched: softlockups in multi_cpu_stop
Date: Fri, 06 Mar 2015 13:02:50 -0500
Message-ID: <54F9EBCA.1060300@oracle.com>
In-Reply-To: <1425662342.19505.41.camel@stgolabs.net>

On 03/06/2015 12:19 PM, Davidlohr Bueso wrote:
>> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
>> > index 1c0d11e8ce34..e4ad019e23f5 100644
>> > --- a/kernel/locking/rwsem-xadd.c
>> > +++ b/kernel/locking/rwsem-xadd.c
>> > @@ -298,23 +298,30 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
>> >  static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
>> >  {
>> >  	struct task_struct *owner;
>> > -	bool on_cpu = false;
>> > +	bool ret = true;
>> >  
>> >  	if (need_resched())
>> >  		return false;
>> >  
>> >  	rcu_read_lock();
>> >  	owner = ACCESS_ONCE(sem->owner);
>> > -	if (owner)
>> > -		on_cpu = owner->on_cpu;
>> > -	rcu_read_unlock();
>> > +	if (!owner) {
>> > +		long count = ACCESS_ONCE(sem->count);
>> > +		/*
>> > +		 * If sem->owner is not set, yet we have just recently entered the
>> > +		 * slowpath with the lock being active, then there is a possibility
>> > +		 * reader(s) may have the lock. To be safe, bail spinning in these
>> > +		 * situations.
>> > +		 */
>> > +		if (count & RWSEM_ACTIVE_MASK)
>> > +			ret = false;
>> > +		goto done;
> Hmmm so the lockup would be due to this (when owner is non-nil the patch
> has no effect), telling users to spin instead of sleep -- _except_ for
> this condition. And when spinning we're always checking for need_resched
> to be safe. So even if this function was completely bogus, we'd end up
> needlessly spinning but I'm surprised about the lockup. Maybe coffee
> will make things clearer.
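
[Illustration of the check discussed above. This is a user-space sketch, not
the kernel code. The quoted hunk stops at "goto done", so the owner != NULL
branch below is an assumption based on the removed lines and on the remark
that the patch has no effect when owner is set. The struct names, the
need_resched parameter, and the RWSEM_ACTIVE_MASK value are hypothetical
stand-ins chosen for the example.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; the real mask depends on the architecture. */
#define RWSEM_ACTIVE_MASK 0x0000ffffL

/* Hypothetical stand-ins for task_struct and rw_semaphore. */
struct task_model {
	bool on_cpu;              /* is the lock owner currently running? */
};

struct rwsem_model {
	long count;               /* low bits count active lockers */
	struct task_model *owner; /* NULL when no writer has recorded itself */
};

/*
 * Model of the patched decision: may a waiter spin instead of sleeping?
 * need_resched is passed in because there is no scheduler in user space.
 */
static bool can_spin_on_owner(const struct rwsem_model *sem, bool need_resched)
{
	assert(sem != NULL);

	if (need_resched)
		return false;

	if (!sem->owner) {
		/*
		 * No recorded owner: if the lock is still active, readers
		 * may hold it, so do not spin. Otherwise spinning is allowed.
		 */
		return (sem->count & RWSEM_ACTIVE_MASK) == 0;
	}

	/* Assumed unchanged by the patch: spin only while the owner runs. */
	return sem->owner->on_cpu;
}
```

[In this model the only new "do not spin" case is owner == NULL with active
bits set in count. Every other owner == NULL case now returns true, where the
old code returned false.]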

There's always the possibility that bisect went wrong. I did it twice, but
since I don't have a sure way of reproducing it I was basing my good/bad
decisions on whether I saw it within a reasonable amount of time.

I can redo the bisect if you suspect that commit is not the cause.


Thanks,
Sasha


