All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Paul Mackerras <paulus@samba.org>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@kernel.org>, Oleg Nesterov <oleg@redhat.com>,
	Tejun Heo <tj@kernel.org>, Michel Lespinasse <walken@google.com>,
	ego@linux.vnet.ibm.com,
	"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>,
	Thomas Gleixner <tglx@linutronix.de>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: Deadlock between cpu_hotplug_begin and cpu_add_remove_lock
Date: Wed, 22 Jan 2014 14:00:40 +0530	[thread overview]
Message-ID: <52DF81B0.7020700@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140122055239.GA29418@iris.ozlabs.ibm.com>

Hi Paul,

On 01/22/2014 11:22 AM, Paul Mackerras wrote:
> This arises out of a report from a tester that offlining a CPU never
> finished on a system they were testing.  This was on a POWER8 running
> a 3.10.x kernel, but the issue is still present in mainline AFAICS.
> 
> What I found when I looked at the system was this:
> 
> * There was a ppc64_cpu process stuck inside cpu_hotplug_begin(),
>   called from _cpu_down(), from cpu_down().  This process was holding
>   the cpu_add_remove_lock mutex, since cpu_down() calls
>   cpu_maps_update_begin() before calling _cpu_down().  It was stuck
>   there because cpu_hotplug.refcount == 1.
> 
> * There was a mdadm process trying to acquire the cpu_add_remove_lock
>   mutex inside register_cpu_notifier(), called from
>   raid5_alloc_percpu() in drivers/md/raid5.c.  That process had
>   previously called get_online_cpus, which is why cpu_hotplug.refcount
>   was 1.
> 
> Result: deadlock.
> 
> Thus it seems that the following code is not safe:
> 
> 	get_online_cpus();
> 	register_cpu_notifier(&...);
> 	put_online_cpus();
>

Yes, this is a known problem, and I had proposed an elaborate solution
some time ago: https://lkml.org/lkml/2012/3/1/39
But that won't work for all cases, so that solution is a no-go.

If we forget the CPU_POST_DEAD stage for a moment, we can just replace the
calls to cpu_maps_update_begin/done() with get/put_online_cpus() in both
register_cpu_notifier() as well as unregister_cpu_notifier(). After all,
the callback registration code needs to synchronize only with the actual
hotplug operations, and not the update of cpu-maps. So they don't really
need to acquire the cpu_add_remove_lock.

However, CPU_POST_DEAD notifications run with the hotplug lock dropped.
So we can't simply replace cpu_add_remove_lock with hotplug lock in the
registration routines, because notifier invocations and notifier registration
needs to be synchronized.

Hmm...
 
> There are a few different places that do that sort of thing; besides
> drivers/md/raid5.c, there are instances in arch/x86/kernel/cpu,
> arch/x86/oprofile, drivers/cpufreq/acpi-cpufreq.c,
> drivers/oprofile/nmi_timer_int.c and kernel/trace/ring_buffer.c.
> 
> My question is this: is it reasonable to call register_cpu_notifier
> inside a get/put_online_cpus block?

Ideally, we would want that to work. Because there is no other race-free
way of registering a notifier.

>  If so, the deadlock needs to be
> fixed; if not, the callers need to be fixed, and the restriction
> should be documented.

Fixing the callers is a last resort. I'm thinking of ways to fix the
deadlock itself, and allow the callers to call register_cpu_notifier
within a get/put_online_cpus() block...

Regards,
Srivatsa S. Bhat



  reply	other threads:[~2014-01-22  8:36 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-22  5:52 Deadlock between cpu_hotplug_begin and cpu_add_remove_lock Paul Mackerras
2014-01-22  8:30 ` Srivatsa S. Bhat [this message]
2014-01-22  9:16   ` Srivatsa S. Bhat
2014-01-22 19:18     ` Oleg Nesterov
2014-01-22 19:58       ` Srivatsa S. Bhat
2014-01-23 17:02         ` Oleg Nesterov
2014-01-28 14:32           ` Srivatsa S. Bhat
2014-01-23  2:29     ` Rusty Russell
2014-01-23  5:36       ` Srivatsa S. Bhat
2014-01-23 23:01         ` Rusty Russell
2014-01-28 14:36           ` Srivatsa S. Bhat

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52DF81B0.7020700@linux.vnet.ibm.com \
    --to=srivatsa.bhat@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=ego@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=walken@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.