From: David Hildenbrand <dahi@linux.vnet.ibm.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: linux-kernel@vger.kernel.org, heiko.carstens@de.ibm.com,
borntraeger@de.ibm.com, rafael.j.wysocki@intel.com,
paulmck@linux.vnet.ibm.com, peterz@infradead.org, bp@suse.de,
jkosina@suse.cz
Subject: Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
Date: Wed, 10 Dec 2014 20:21:36 +0100
Message-ID: <20141210202136.2c41d678@thinkpad-w530>
In-Reply-To: <20141210175055.GA11802@redhat.com>
> On 12/10, David Hildenbrand wrote:
> >
> > @@ -127,20 +119,16 @@ void put_online_cpus(void)
> >  {
> >  	if (cpu_hotplug.active_writer == current)
> >  		return;
> > -	if (!mutex_trylock(&cpu_hotplug.lock)) {
> > -		atomic_inc(&cpu_hotplug.puts_pending);
> > -		cpuhp_lock_release();
> > -		return;
> > -	}
> > -
> > -	if (WARN_ON(!cpu_hotplug.refcount))
> > -		cpu_hotplug.refcount++; /* try to fix things up */
> >
> > -	if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> > -		wake_up_process(cpu_hotplug.active_writer);
> > -	mutex_unlock(&cpu_hotplug.lock);
> > -	cpuhp_lock_release();
> > +	if (atomic_dec_and_test(&cpu_hotplug.refcount) &&
> > +	    waitqueue_active(&cpu_hotplug.wq))
> > +		wake_up(&cpu_hotplug.wq);
>
> OK, waitqueue_active() looks safe... prepare_to_wait() has a barrier.
>
> >  void cpu_hotplug_begin(void)
> >  {
> > +	DEFINE_WAIT(wait);
> > +
> >  	cpu_hotplug.active_writer = current;
> >
> > -	cpuhp_lock_acquire();
> >  	for (;;) {
> > +		cpuhp_lock_acquire();
>
> not sure I understand why did you move cpuhp_lock_acquire() into
> the loop, but this is minor.
Well, I got some lockdep issues and this way I was able to solve them
(lockdep complained about the same thread that called cpu_hotplug_begin()
calling put_online_cpus(), so we have to correctly tell lockdep when we get
and release the lock).
So I guess I also need that in the loop, or am I wrong (due to
cpuhp_lock_release())?
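To illustrate the pairing I have in mind, here is a minimal userspace sketch, not kernel code. The counter `held` and the helpers `model_acquire()`, `model_release()` and `writer_loop_held_after()` are hypothetical stand-ins for the lockdep annotations; the real lockdep bookkeeping is more involved. It only shows that, once the release sits inside the loop, the acquire has to sit there too so that every release has a matching acquire and exactly one acquire is outstanding when the loop exits.

```c
#include <assert.h>

/* Hypothetical stand-in for lockdep's view of the hotplug "lock". */
static int held;

static void model_acquire(void)
{
	held++;
}

static void model_release(void)
{
	assert(held > 0);	/* a release without a prior acquire is a bug */
	held--;
}

/*
 * Models the writer loop from the patch. `sleeps` is the number of
 * iterations in which the refcount was still non-zero. Returns how many
 * acquires are outstanding when the loop exits.
 */
static int writer_loop_held_after(int sleeps)
{
	held = 0;
	for (;;) {
		model_acquire();	/* cpuhp_lock_acquire() */
		if (sleeps-- == 0)
			break;		/* refcount == 0: keep the annotation */
		model_release();	/* cpuhp_lock_release() before schedule() */
	}
	return held;
}
```

With the acquire outside the loop, the second trip through the loop would hit the `assert()` in `model_release()`, which is roughly the kind of imbalance I assume lockdep was complaining about.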
>
> >  		mutex_lock(&cpu_hotplug.lock);
> > -		apply_puts_pending(1);
> > -		if (likely(!cpu_hotplug.refcount))
> > +		prepare_to_wait(&cpu_hotplug.wq, &wait, TASK_UNINTERRUPTIBLE);
> > +		if (likely(!atomic_read(&cpu_hotplug.refcount)))
> >  			break;
> > -		__set_current_state(TASK_UNINTERRUPTIBLE);
> >  		mutex_unlock(&cpu_hotplug.lock);
> > +		cpuhp_lock_release();
> >  		schedule();
> >  	}
> > +
> > +	finish_wait(&cpu_hotplug.wq, &wait);
> >  }
>
> This is subjective, but how about
>
> static bool xxx(void)
> {
> 	mutex_lock(&cpu_hotplug.lock);
> 	if (atomic_read(&cpu_hotplug.refcount) == 0)
> 		return true;
> 	mutex_unlock(&cpu_hotplug.lock);
> 	return false;
> }
>
> void cpu_hotplug_begin(void)
> {
> 	cpu_hotplug.active_writer = current;
>
> 	cpuhp_lock_acquire();
> 	wait_event(&cpu_hotplug.wq, xxx());
> }
>
> instead?
>
What I don't like about that suggestion is that the mutex_lock() happens in
another level of indirection, so by looking at cpu_hotplug_begin() it isn't
obvious that the lock remains held after this function has returned.
On the other hand, this is a really compact one (+ possibly lockdep
annotations) :) .
> Oleg.
>
It is important that we do the state change to TASK_UNINTERRUPTIBLE prior to
checking for the condition.
Is it guaranteed with wait_event() that things like the following won't happen?
1. CPU1 wakes up the wq (refcount == 0)
2. CPU2 calls get_online_cpus() and increments refcount. (refcount == 1)
3. CPU3 executes xxx() up to "return false;" and gets scheduled away
4. CPU2 calls put_online_cpus(), decrementing the refcount (refcount == 0)
   -> waitqueue not active -> no wake up
5. CPU3 continues executing and sleeps
   -> refcount == 0 but writer is not woken up
In other words, does wait_event() take care of wakeups that arrive while
xxx() is executing (e.g. by activating the wait queue and setting
TASK_UNINTERRUPTIBLE just before calling xxx())?
In my code, this is guaranteed by calling
prepare_to_wait(&cpu_hotplug.wq, &wait, TASK_UNINTERRUPTIBLE) prior to
checking for the condition.
If that is guaranteed, this would work. Will verify that tomorrow.
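To make the ordering concern concrete, here is a minimal single-threaded userspace sketch that replays the interleaving above. It is not kernel code: `struct model`, `model_put()` and `writer_sleeps_forever()` are hypothetical names, and the "wake-up" is reduced to a flag. It only shows why the writer has to be on the wait queue before it evaluates the condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state involved; not the kernel structures. */
struct model {
	int  refcount;		/* stands in for cpu_hotplug.refcount */
	bool writer_queued;	/* writer is on the wait queue */
	bool writer_woken;	/* a wake-up reached the writer */
};

/* Reader side, mirroring the patched put_online_cpus(). */
static void model_put(struct model *m)
{
	assert(m->refcount > 0);
	if (--m->refcount == 0 && m->writer_queued)
		m->writer_woken = true;	/* waitqueue_active() && wake_up() */
}

/*
 * Replays the race with the writer queueing itself either before or
 * after it checks the condition. Returns true if the writer would
 * sleep with nobody left to wake it.
 */
static bool writer_sleeps_forever(bool queue_before_check)
{
	struct model m = { .refcount = 0 };
	bool cond;

	m.refcount++;			/* reader: get_online_cpus() */

	if (queue_before_check)
		m.writer_queued = true;	/* prepare_to_wait() comes first */
	cond = (m.refcount == 0);	/* writer checks: false */

	model_put(&m);			/* reader: put_online_cpus() races in */

	if (!queue_before_check)
		m.writer_queued = true;	/* queued too late, wake-up was skipped */

	if (cond)
		return false;		/* writer never goes to sleep */
	/* Writer calls schedule(); it only comes back if it was woken. */
	return !m.writer_woken;
}
```

In this model, `writer_sleeps_forever(false)` reports the lost wake-up, while `writer_sleeps_forever(true)` does not, which is the ordering my loop uses.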
Thanks a lot!
David