From: David Hildenbrand <dahi@linux.vnet.ibm.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: linux-kernel@vger.kernel.org, heiko.carstens@de.ibm.com,
borntraeger@de.ibm.com, rafael.j.wysocki@intel.com,
paulmck@linux.vnet.ibm.com, peterz@infradead.org, bp@suse.de,
jkosina@suse.cz
Subject: Re: [PATCH v4] CPU hotplug: active_writer not woken up in some cases - deadlock
Date: Wed, 10 Dec 2014 20:21:36 +0100
Message-ID: <20141210202136.2c41d678@thinkpad-w530>
In-Reply-To: <20141210175055.GA11802@redhat.com>
> On 12/10, David Hildenbrand wrote:
> >
> > @@ -127,20 +119,16 @@ void put_online_cpus(void)
> > {
> > if (cpu_hotplug.active_writer == current)
> > return;
> > - if (!mutex_trylock(&cpu_hotplug.lock)) {
> > - atomic_inc(&cpu_hotplug.puts_pending);
> > - cpuhp_lock_release();
> > - return;
> > - }
> > -
> > - if (WARN_ON(!cpu_hotplug.refcount))
> > - cpu_hotplug.refcount++; /* try to fix things up */
> >
> > - if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
> > - wake_up_process(cpu_hotplug.active_writer);
> > - mutex_unlock(&cpu_hotplug.lock);
> > - cpuhp_lock_release();
> > + if (atomic_dec_and_test(&cpu_hotplug.refcount) &&
> > + waitqueue_active(&cpu_hotplug.wq))
> > + wake_up(&cpu_hotplug.wq);
>
> OK, waitqueue_active() looks safe... prepare_to_wait() has a barrier.
>
> > void cpu_hotplug_begin(void)
> > {
> > + DEFINE_WAIT(wait);
> > +
> > cpu_hotplug.active_writer = current;
> >
> > - cpuhp_lock_acquire();
> > for (;;) {
> > + cpuhp_lock_acquire();
>
> not sure I understand why did you move cpuhp_lock_acquire() into
> the loop, but this is minor.
Well, I got some lockdep complaints and this way I was able to solve them
(lockdep complained about the same thread that called cpu_hotplug_begin() then
calling put_online_cpus(), so we have to correctly tell lockdep when we acquire
and release the lock).
So I guess I also need that in the loop, or am I wrong (due to
cpuhp_lock_release())?
>
> > mutex_lock(&cpu_hotplug.lock);
> > - apply_puts_pending(1);
> > - if (likely(!cpu_hotplug.refcount))
> > + prepare_to_wait(&cpu_hotplug.wq, &wait, TASK_UNINTERRUPTIBLE);
> > + if (likely(!atomic_read(&cpu_hotplug.refcount)))
> > break;
> > - __set_current_state(TASK_UNINTERRUPTIBLE);
> > mutex_unlock(&cpu_hotplug.lock);
> > + cpuhp_lock_release();
> > schedule();
> > }
> > +
> > + finish_wait(&cpu_hotplug.wq, &wait);
> > }
>
> This is subjective, but how about
>
> static bool xxx(void)
> {
> mutex_lock(&cpu_hotplug.lock);
> if (atomic_read(&cpu_hotplug.refcount) == 0)
> return true;
> mutex_unlock(&cpu_hotplug.lock);
> return false;
> }
>
> void cpu_hotplug_begin(void)
> {
> cpu_hotplug.active_writer = current;
>
> cpuhp_lock_acquire();
> wait_event(&cpu_hotplug.wq, xxx());
> }
>
> instead?
>
What I don't like about that suggestion is that the mutex_lock() happens in
another level of indirection, so by looking at cpu_hotplug_begin() it isn't
obvious that the lock is still held after this function returns.
On the other hand this is a really compact one (+ possibly lockdep
annotations) :) .
> Oleg.
>
It is important that we change the state to TASK_UNINTERRUPTIBLE before
checking the condition.
Is it guaranteed with wait_event() that things like the following won't happen?
1. CPU1 wakes up the wq (refcount == 0)
2. CPU2 calls get_online_cpus() and increments refcount. (refcount == 1)
3. CPU3 executes xxx() up to "return false;" and gets scheduled away
4. CPU2 calls put_online_cpus(), decrementing the refcount (refcount == 0)
-> waitqueue not active -> no wake up
5. CPU3 continues executing and sleeps
-> refcount == 0 but writer is not woken up
In other words, does wait_event() take care of wakeups that arrive while xxx()
is executing (e.g. by activating the wait queue and setting
TASK_UNINTERRUPTIBLE just before calling xxx())?
In my code, this is guaranteed by calling
prepare_to_wait(&cpu_hotplug.wq, &wait, TASK_UNINTERRUPTIBLE); prior to checking for the condition.
If that is guaranteed, this would work. Will verify that tomorrow.
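To make the ordering concern concrete, here is a tiny userspace toy model. It
is not kernel code, and all toy_* names are made up for illustration. It
replays a reader's get/put around the writer's condition check, once with the
writer queued on the wait queue only after the check and once with it queued
before the check. It says nothing about what wait_event() really does
internally; it only sketches why the order matters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for cpu_hotplug.refcount and cpu_hotplug.wq. */
struct toy_hotplug {
	int refcount;		/* number of active readers */
	bool waiter_queued;	/* models waitqueue_active() returning true */
	bool writer_woken;	/* set when a wake-up reaches the writer */
};

/* Reader side, modelled on get_online_cpus(). */
static void toy_get(struct toy_hotplug *hp)
{
	hp->refcount++;
}

/* Reader side, modelled on put_online_cpus() from the patch. */
static void toy_put(struct toy_hotplug *hp)
{
	/* Wake the writer only on the last put and only if it is queued. */
	if (--hp->refcount == 0 && hp->waiter_queued)
		hp->writer_woken = true;
}

/*
 * Replay the interleaving in a fixed order. Returns true if the writer
 * would go to sleep although refcount is 0 and no wake-up reached it.
 */
static bool toy_writer_stuck(bool queue_before_check)
{
	struct toy_hotplug hp = { 0, false, false };
	bool cond;

	toy_get(&hp);				/* reader takes a reference */
	if (queue_before_check)
		hp.waiter_queued = true;	/* like prepare_to_wait() first */
	cond = (hp.refcount == 0);		/* writer checks, sees refcount 1 */
	toy_put(&hp);				/* reader drops the reference */
	if (!queue_before_check)
		hp.waiter_queued = true;	/* writer queued too late */

	/* The writer sleeps unless the condition held or it was woken. */
	return !cond && !hp.writer_woken;
}
```

In this model toy_writer_stuck(false) reports a lost wake-up, while
toy_writer_stuck(true) does not, because the final put sees the writer on the
queue and wakes it.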
Thanks a lot!
David