From: Andrew Morton <akpm@linux-foundation.org>
To: miaox@cn.fujitsu.com
Cc: Paul Menage <menage@google.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Max Krasnyansky <maxk@qualcomm.com>,
	Linux-Kernel <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [RESEND][PATCH] cpuset: fix possible deadlock in async_rebuild_sched_domains
Date: Mon, 26 Jan 2009 20:03:07 -0800
Message-ID: <20090126200307.833b087a.akpm@linux-foundation.org>
In-Reply-To: <497540BE.4070408@cn.fujitsu.com>

On Tue, 20 Jan 2009 11:10:54 +0800 Miao Xie <miaox@cn.fujitsu.com> wrote:

> Lockdep reported a possible circular locking dependency when we tested cpuset
> on a NUMA/fake NUMA box.
> 
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.29-rc1-00224-ga652504 #111
> -------------------------------------------------------
> bash/2968 is trying to acquire lock:
>  (events){--..}, at: [<ffffffff8024c8cd>] flush_work+0x24/0xd8
> 
> but task is already holding lock:
>  (cgroup_mutex){--..}, at: [<ffffffff8026ad1e>] cgroup_lock_live_group+0x12/0x29
> 
> which lock already depends on the new lock.
> ......
> -------------------------------------------------------
> 
> Steps to reproduce:
> # mkdir /dev/cpuset
> # mount -t cpuset xxx /dev/cpuset
> # mkdir /dev/cpuset/0
> # echo 0 > /dev/cpuset/0/cpus
> # echo 0 > /dev/cpuset/0/mems
> # echo 1 > /dev/cpuset/0/memory_migrate
> # cat /dev/zero > /dev/null &
> # echo $! > /dev/cpuset/0/tasks
> 
> This is because async_rebuild_sched_domains has the following lock sequence:
> run_workqueue(async_rebuild_sched_domains)
> 	-> do_rebuild_sched_domains -> cgroup_lock
> 
> But attaching tasks when memory_migrate is set has the following lock sequence:
> cgroup_lock_live_group(cgroup_tasks_write)
> 	-> do_migrate_pages -> flush_work

Where is this flush_work() call?  lru_add_drain_all()->schedule_on_each_cpu()?

If so, and if that is the only such callsite then we could/should
rework this code to use work_on_cpu(), if we manage to fix that thing.

It would be somewhat inefficient.  It would be better if work_on_cpu()
were to take a cpumask argument, and avoid blocking behind each CPU one
at a time.  But first things first.

> This patch fixes it by using a separate workqueue thread.

<wonders when RESERVED_PIDS became a logarithm>