From: Paul Mackerras <paulus@samba.org>
To: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Subject: Deadlock between cpu_hotplug_begin and cpu_add_remove_lock
Date: Wed, 22 Jan 2014 16:52:39 +1100
Message-ID: <20140122055239.GA29418@iris.ozlabs.ibm.com>

This arises out of a report from a tester that offlining a CPU never
finished on a system they were testing.  This was on a POWER8 running
a 3.10.x kernel, but the issue is still present in mainline AFAICS.

What I found when I looked at the system was this:

* There was a ppc64_cpu process stuck inside cpu_hotplug_begin(),
  called from _cpu_down(), from cpu_down().  This process was holding
  the cpu_add_remove_lock mutex, since cpu_down() calls
  cpu_maps_update_begin() before calling _cpu_down().  It was stuck
  there because cpu_hotplug.refcount == 1.

* There was an mdadm process trying to acquire the cpu_add_remove_lock
  mutex inside register_cpu_notifier(), called from
  raid5_alloc_percpu() in drivers/md/raid5.c.  That process had
  previously called get_online_cpus(), which is why
  cpu_hotplug.refcount was 1.

Result: deadlock.

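To make the interleaving above easier to follow, here is a rough
user-space sketch that replays the four steps against a toy model of
the two pieces of state involved.  This is not kernel code: the
model_*() helpers, the task enum, struct model and
replay_reported_interleaving() are all made up for illustration, and
the model assumes only what is described above (cpu_hotplug_begin()
waits for the refcount to drop to zero, and register_cpu_notifier()
takes cpu_add_remove_lock).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the names mirror the kernel's but this is not kernel code. */
enum task { TASK_NONE = 0, TASK_HOTPLUG_WRITER, TASK_NOTIFIER_CALLER };

struct model {
	enum task add_remove_owner;	/* holder of cpu_add_remove_lock, if any */
	int refcount;			/* stands in for cpu_hotplug.refcount */
	bool writer_blocked;		/* writer is waiting for refcount == 0 */
	bool caller_blocked;		/* caller is waiting for cpu_add_remove_lock */
};

/* Reader side: get_online_cpus() is modelled as a refcount increment. */
static void model_get_online_cpus(struct model *m)
{
	m->refcount++;
}

/* Try to take cpu_add_remove_lock; false means the task would sleep. */
static bool model_lock_add_remove(struct model *m, enum task t)
{
	if (m->add_remove_owner != TASK_NONE)
		return false;
	m->add_remove_owner = t;
	return true;
}

/* cpu_hotplug_begin() can only proceed once no reader holds a reference. */
static bool model_cpu_hotplug_begin_can_proceed(const struct model *m)
{
	return m->refcount == 0;
}

/* Replay the reported order; returns true if both tasks end up stuck. */
bool replay_reported_interleaving(void)
{
	struct model m = { 0 };
	bool got_lock;

	/* mdadm: get_online_cpus() */
	model_get_online_cpus(&m);

	/* ppc64_cpu: cpu_down() -> cpu_maps_update_begin() */
	got_lock = model_lock_add_remove(&m, TASK_HOTPLUG_WRITER);

	/* ppc64_cpu: _cpu_down() -> cpu_hotplug_begin(), refcount is 1 */
	m.writer_blocked = got_lock && !model_cpu_hotplug_begin_can_proceed(&m);

	/* mdadm: register_cpu_notifier() wants cpu_add_remove_lock */
	m.caller_blocked = !model_lock_add_remove(&m, TASK_NOTIFIER_CALLER);

	/* Each task now waits on something only the other can release. */
	return m.writer_blocked && m.caller_blocked;
}
```

In this model the writer holds the mutex while waiting on the
refcount, and the reader holds the refcount while waiting on the
mutex, so neither can make progress.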
Thus it seems that the following code is not safe:

	get_online_cpus();
	register_cpu_notifier(&...);
	put_online_cpus();

There are a few different places that do that sort of thing; besides
drivers/md/raid5.c, there are instances in arch/x86/kernel/cpu,
arch/x86/oprofile, drivers/cpufreq/acpi-cpufreq.c,
drivers/oprofile/nmi_timer_int.c and kernel/trace/ring_buffer.c.

My question is this: is it reasonable to call register_cpu_notifier()
inside a get_online_cpus()/put_online_cpus() block?  If so, the
deadlock needs to be fixed; if not, the callers need to be fixed, and
the restriction should be documented.

Regards,
Paul.