From: Andrew Morton <akpm@linux-foundation.org>
To: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: paulmck@linux.vnet.ibm.com, ego@in.ibm.com,
	rusty@rustcorp.com.au, mingo@elte.hu,
	linux-kernel@vger.kernel.org, peterz@infradead.org,
	oleg@redhat.com, dipankar@in.ibm.com
Subject: Re: [PATCH -mm resend] cpuhotplug: introduce try_get_online_cpus() take 3
Date: Tue, 9 Jun 2009 18:42:38 -0700
Message-ID: <20090609184238.06b38c3e.akpm@linux-foundation.org>
In-Reply-To: <4A2F08D6.6060309@cn.fujitsu.com>

On Wed, 10 Jun 2009 09:13:58 +0800 Lai Jiangshan <laijs@cn.fujitsu.com> wrote:

> It's for -mm tree.
> 
> It also works for mainline if you apply this at first:
> http://lkml.org/lkml/2009/2/17/58
> 
> Subject: [PATCH -mm] cpuhotplug: introduce try_get_online_cpus() take 3
> 
> get_online_cpus() is a typical coarse-grained lock.
> It's a source of ABBA or ABBCCA... deadlocks.
> 
> Thanks to the CPU notifiers, some subsystems' global locks must
> be acquired after cpu_hotplug.lock. A subsystem's global lock
> is a coarse-grained lock too, so a lot of locks in the kernel
> must be acquired after cpu_hotplug.lock (if we need
> cpu_hotplug.lock held too).
> 
> Otherwise it may lead to an ABBA deadlock like this:
> 
> thread 1                                      |        thread 2
> _cpu_down()                                   |  Lock a-kernel-lock.
>   cpu_hotplug_begin()                         |
>     mutex_lock(&cpu_hotplug.lock)             |
>   __raw_notifier_call_chain(CPU_DOWN_PREPARE) |  get_online_cpus()
> ------------------------------------------------------------------------
>     Lock a-kernel-lock.(wait thread2)         |    mutex_lock(&cpu_hotplug.lock)
>                                                    (wait thread 1)

uh, OK.

> But CPU online/offline happens very rarely, so get_online_cpus()
> returns success quickly in all probability.
> So it's asinine behavior that get_online_cpus() is not allowed
> to be called after we have taken "a-kernel-lock".
> 
> To dispel the ABBA deadlock, this patch introduces
> try_get_online_cpus(). It fails very rarely, and it gives the
> caller a chance to select an alternative way to finish its work
> instead of sleeping or deadlocking.

I still think we should really avoid having to do this.  trylocks are
nasty things.

Looking at the above, one would think that a correct fix would be to fix
the bug in "thread 2": take the locks in the correct order?  As
try_get_online_cpus() doesn't actually have any callers, it's hard to
take that thought any further.
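
To make the two options concrete, here is a rough userspace sketch using
pthreads. It is not kernel code and not part of the patch: hotplug_lock and
subsys_lock are hypothetical stand-ins for cpu_hotplug.lock and
"a-kernel-lock", and both function names are made up for illustration. It
models only the lock ordering, not how get_online_cpus() is actually
implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for cpu_hotplug.lock and "a-kernel-lock". */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t subsys_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Option 1: fix the ordering in "thread 2". Every path takes
 * hotplug_lock before subsys_lock, so no ABBA cycle can form.
 */
static int work_ordered(void)
{
	int rc;

	rc = pthread_mutex_lock(&hotplug_lock);
	assert(rc == 0);
	rc = pthread_mutex_lock(&subsys_lock);
	assert(rc == 0);

	/* ... work that needs both locks ... */

	pthread_mutex_unlock(&subsys_lock);
	pthread_mutex_unlock(&hotplug_lock);
	return 0;
}

/*
 * Option 2: keep the "wrong" order but never block on the second lock.
 * Returns true if the fast path ran, false if the caller had to fall
 * back to an alternative that does not need hotplug_lock.
 */
static bool work_trylock(void)
{
	bool fast = false;

	pthread_mutex_lock(&subsys_lock);
	if (pthread_mutex_trylock(&hotplug_lock) == 0) {
		/* ... fast path with both locks held ... */
		fast = true;
		pthread_mutex_unlock(&hotplug_lock);
	} else {
		/* ... fallback path, e.g. defer or use a slower method ... */
	}
	pthread_mutex_unlock(&subsys_lock);
	return fast;
}
```

Every caller of the trylock variant has to carry a fallback path that is
rarely exercised, which is a large part of why trylocks are unpleasant.
The ordered variant needs no fallback, but it only works if the caller can
take the outer lock first.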

