All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Gu Zheng <guz.fnst@cn.fujitsu.com>,
	Li Zefan <lizefan@huawei.com>, Tejun Heo <tj@kernel.org>
Subject: [PATCH 3.4 04/22] cpuset,mempolicy: fix sleeping function called from invalid context
Date: Tue, 15 Jul 2014 16:17:09 -0700	[thread overview]
Message-ID: <20140715231622.257893561@linuxfoundation.org> (raw)
In-Reply-To: <20140715231622.121816073@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gu Zheng <guz.fnst@cn.fujitsu.com>

commit 391acf970d21219a2a5446282d3b20eace0c0d7a upstream.

When runing with the kernel(3.15-rc7+), the follow bug occurs:
[ 9969.258987] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
[ 9969.359906] in_atomic(): 1, irqs_disabled(): 0, pid: 160655, name: python
[ 9969.441175] INFO: lockdep is turned off.
[ 9969.488184] CPU: 26 PID: 160655 Comm: python Tainted: G       A      3.15.0-rc7+ #85
[ 9969.581032] Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012
[ 9969.706052]  ffffffff81a20e60 ffff8803e941fbd0 ffffffff8162f523 ffff8803e941fd18
[ 9969.795323]  ffff8803e941fbe0 ffffffff8109995a ffff8803e941fc58 ffffffff81633e6c
[ 9969.884710]  ffffffff811ba5dc ffff880405c6b480 ffff88041fdd90a0 0000000000002000
[ 9969.974071] Call Trace:
[ 9970.003403]  [<ffffffff8162f523>] dump_stack+0x4d/0x66
[ 9970.065074]  [<ffffffff8109995a>] __might_sleep+0xfa/0x130
[ 9970.130743]  [<ffffffff81633e6c>] mutex_lock_nested+0x3c/0x4f0
[ 9970.200638]  [<ffffffff811ba5dc>] ? kmem_cache_alloc+0x1bc/0x210
[ 9970.272610]  [<ffffffff81105807>] cpuset_mems_allowed+0x27/0x140
[ 9970.344584]  [<ffffffff811b1303>] ? __mpol_dup+0x63/0x150
[ 9970.409282]  [<ffffffff811b1385>] __mpol_dup+0xe5/0x150
[ 9970.471897]  [<ffffffff811b1303>] ? __mpol_dup+0x63/0x150
[ 9970.536585]  [<ffffffff81068c86>] ? copy_process.part.23+0x606/0x1d40
[ 9970.613763]  [<ffffffff810bf28d>] ? trace_hardirqs_on+0xd/0x10
[ 9970.683660]  [<ffffffff810ddddf>] ? monotonic_to_bootbased+0x2f/0x50
[ 9970.759795]  [<ffffffff81068cf0>] copy_process.part.23+0x670/0x1d40
[ 9970.834885]  [<ffffffff8106a598>] do_fork+0xd8/0x380
[ 9970.894375]  [<ffffffff81110e4c>] ? __audit_syscall_entry+0x9c/0xf0
[ 9970.969470]  [<ffffffff8106a8c6>] SyS_clone+0x16/0x20
[ 9971.030011]  [<ffffffff81642009>] stub_clone+0x69/0x90
[ 9971.091573]  [<ffffffff81641c29>] ? system_call_fastpath+0x16/0x1b

The cause is that cpuset_mems_allowed() try to take
mutex_lock(&callback_mutex) under the rcu_read_lock(which was hold in
__mpol_dup()). And in cpuset_mems_allowed(), the access to cpuset is
under rcu_read_lock, so in __mpol_dup, we can reduce the rcu_read_lock
protection region to protect the access to cpuset only in
current_cpuset_is_being_rebound(). So that we can avoid this bug.

This patch is a temporary solution that just addresses the bug
mentioned above, can not fix the long-standing issue about cpuset.mems
rebinding on fork():

"When the forker's task_struct is duplicated (which includes
 ->mems_allowed) and it races with an update to cpuset_being_rebound
 in update_tasks_nodemask() then the task's mems_allowed doesn't get
 updated. And the child task's mems_allowed can be wrong if the
 cpuset's nodemask changes before the child has been added to the
 cgroup's tasklist."

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/cpuset.c |    8 +++++++-
 mm/mempolicy.c  |    2 --
 2 files changed, 7 insertions(+), 3 deletions(-)

--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1152,7 +1152,13 @@ done:
 
 int current_cpuset_is_being_rebound(void)
 {
-	return task_cs(current) == cpuset_being_rebound;
+	int ret;
+
+	rcu_read_lock();
+	ret = task_cs(current) == cpuset_being_rebound;
+	rcu_read_unlock();
+
+	return ret;
 }
 
 static int update_relax_domain_level(struct cpuset *cs, s64 val)
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1991,7 +1991,6 @@ struct mempolicy *__mpol_dup(struct memp
 	} else
 		*new = *old;
 
-	rcu_read_lock();
 	if (current_cpuset_is_being_rebound()) {
 		nodemask_t mems = cpuset_mems_allowed(current);
 		if (new->flags & MPOL_F_REBINDING)
@@ -1999,7 +1998,6 @@ struct mempolicy *__mpol_dup(struct memp
 		else
 			mpol_rebind_policy(new, &mems, MPOL_REBIND_ONCE);
 	}
-	rcu_read_unlock();
 	atomic_set(&new->refcnt, 1);
 	return new;
 }



  parent reply	other threads:[~2014-07-16  0:22 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-15 23:17 [PATCH 3.4 00/22] 3.4.99-stable review Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 01/22] usb: option: Add ID for Telewell TW-LTE 4G v2 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 02/22] USB: cp210x: add support for Corsair usb dongle Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 03/22] USB: ftdi_sio: Add extra PID Greg Kroah-Hartman
2014-07-15 23:17 ` Greg Kroah-Hartman [this message]
2014-07-15 23:17 ` [PATCH 3.4 05/22] hwmon: (amc6821) Fix permissions for temp2_input Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 06/22] hwmon: (adm1029) Ensure the fan_div cache is updated in set_fan_div Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 07/22] powerpc/perf: Never program book3s PMCs with values >= 0x80000000 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 08/22] ext4: clarify error count warning messages Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 10/22] tracing: Remove ftrace_stop/start() from reading the trace file Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 11/22] rtmutex: Fix deadlock detector for real Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 12/22] rtmutex: Detect changes in the pi lock chain Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 13/22] rtmutex: Handle deadlock detection smarter Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 14/22] rtmutex: Plug slow unlock race Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 15/22] Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option" Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 16/22] x86-64, espfix: Dont leak bits 31:16 of %esp returning to 16-bit stack Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 17/22] x86, espfix: Move espfix definitions into a separate header file Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 18/22] x86, espfix: Fix broken header guard Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 19/22] x86, espfix: Make espfix64 a Kconfig option, fix UML Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 20/22] x86, espfix: Make it possible to disable 16-bit support Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 21/22] x86, ioremap: Speed up check for RAM pages Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.4 22/22] ACPI / battery: Retry to get battery information if failed during probing Greg Kroah-Hartman
2014-07-16  4:23 ` [PATCH 3.4 00/22] 3.4.99-stable review Guenter Roeck
2014-07-16 23:10 ` Greg Kroah-Hartman
2014-07-17 13:25 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140715231622.257893561@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=guz.fnst@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.