stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Gu Zheng <guz.fnst@cn.fujitsu.com>,
	Li Zefan <lizefan@huawei.com>, Tejun Heo <tj@kernel.org>
Subject: [PATCH 3.14 10/66] cpuset,mempolicy: fix sleeping function called from invalid context
Date: Tue, 15 Jul 2014 16:17:04 -0700	[thread overview]
Message-ID: <20140715231702.538107463@linuxfoundation.org> (raw)
In-Reply-To: <20140715231702.156040999@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gu Zheng <guz.fnst@cn.fujitsu.com>

commit 391acf970d21219a2a5446282d3b20eace0c0d7a upstream.

When runing with the kernel(3.15-rc7+), the follow bug occurs:
[ 9969.258987] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
[ 9969.359906] in_atomic(): 1, irqs_disabled(): 0, pid: 160655, name: python
[ 9969.441175] INFO: lockdep is turned off.
[ 9969.488184] CPU: 26 PID: 160655 Comm: python Tainted: G       A      3.15.0-rc7+ #85
[ 9969.581032] Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012
[ 9969.706052]  ffffffff81a20e60 ffff8803e941fbd0 ffffffff8162f523 ffff8803e941fd18
[ 9969.795323]  ffff8803e941fbe0 ffffffff8109995a ffff8803e941fc58 ffffffff81633e6c
[ 9969.884710]  ffffffff811ba5dc ffff880405c6b480 ffff88041fdd90a0 0000000000002000
[ 9969.974071] Call Trace:
[ 9970.003403]  [<ffffffff8162f523>] dump_stack+0x4d/0x66
[ 9970.065074]  [<ffffffff8109995a>] __might_sleep+0xfa/0x130
[ 9970.130743]  [<ffffffff81633e6c>] mutex_lock_nested+0x3c/0x4f0
[ 9970.200638]  [<ffffffff811ba5dc>] ? kmem_cache_alloc+0x1bc/0x210
[ 9970.272610]  [<ffffffff81105807>] cpuset_mems_allowed+0x27/0x140
[ 9970.344584]  [<ffffffff811b1303>] ? __mpol_dup+0x63/0x150
[ 9970.409282]  [<ffffffff811b1385>] __mpol_dup+0xe5/0x150
[ 9970.471897]  [<ffffffff811b1303>] ? __mpol_dup+0x63/0x150
[ 9970.536585]  [<ffffffff81068c86>] ? copy_process.part.23+0x606/0x1d40
[ 9970.613763]  [<ffffffff810bf28d>] ? trace_hardirqs_on+0xd/0x10
[ 9970.683660]  [<ffffffff810ddddf>] ? monotonic_to_bootbased+0x2f/0x50
[ 9970.759795]  [<ffffffff81068cf0>] copy_process.part.23+0x670/0x1d40
[ 9970.834885]  [<ffffffff8106a598>] do_fork+0xd8/0x380
[ 9970.894375]  [<ffffffff81110e4c>] ? __audit_syscall_entry+0x9c/0xf0
[ 9970.969470]  [<ffffffff8106a8c6>] SyS_clone+0x16/0x20
[ 9971.030011]  [<ffffffff81642009>] stub_clone+0x69/0x90
[ 9971.091573]  [<ffffffff81641c29>] ? system_call_fastpath+0x16/0x1b

The cause is that cpuset_mems_allowed() try to take
mutex_lock(&callback_mutex) under the rcu_read_lock(which was hold in
__mpol_dup()). And in cpuset_mems_allowed(), the access to cpuset is
under rcu_read_lock, so in __mpol_dup, we can reduce the rcu_read_lock
protection region to protect the access to cpuset only in
current_cpuset_is_being_rebound(). So that we can avoid this bug.

This patch is a temporary solution that just addresses the bug
mentioned above, can not fix the long-standing issue about cpuset.mems
rebinding on fork():

"When the forker's task_struct is duplicated (which includes
 ->mems_allowed) and it races with an update to cpuset_being_rebound
 in update_tasks_nodemask() then the task's mems_allowed doesn't get
 updated. And the child task's mems_allowed can be wrong if the
 cpuset's nodemask changes before the child has been added to the
 cgroup's tasklist."

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/cpuset.c |    8 +++++++-
 mm/mempolicy.c  |    2 --
 2 files changed, 7 insertions(+), 3 deletions(-)

--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1236,7 +1236,13 @@ done:
 
 int current_cpuset_is_being_rebound(void)
 {
-	return task_cs(current) == cpuset_being_rebound;
+	int ret;
+
+	rcu_read_lock();
+	ret = task_cs(current) == cpuset_being_rebound;
+	rcu_read_unlock();
+
+	return ret;
 }
 
 static int update_relax_domain_level(struct cpuset *cs, s64 val)
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2170,7 +2170,6 @@ struct mempolicy *__mpol_dup(struct memp
 	} else
 		*new = *old;
 
-	rcu_read_lock();
 	if (current_cpuset_is_being_rebound()) {
 		nodemask_t mems = cpuset_mems_allowed(current);
 		if (new->flags & MPOL_F_REBINDING)
@@ -2178,7 +2177,6 @@ struct mempolicy *__mpol_dup(struct memp
 		else
 			mpol_rebind_policy(new, &mems, MPOL_REBIND_ONCE);
 	}
-	rcu_read_unlock();
 	atomic_set(&new->refcnt, 1);
 	return new;
 }



  parent reply	other threads:[~2014-07-15 23:17 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-15 23:16 [PATCH 3.14 00/66] 3.14.13-stable review Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 01/66] usb: option: Add ID for Telewell TW-LTE 4G v2 Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 02/66] USB: cp210x: add support for Corsair usb dongle Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 03/66] USB: ftdi_sio: Add extra PID Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 04/66] USB: serial: ftdi_sio: Add Infineon Triboard Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 05/66] iio: ti_am335x_adc: Fix: Use same step id at FIFOs both ends Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 06/66] parisc: add serial ports of C8000/1GHz machine to hardware database Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 07/66] parisc: fix fanotify_mark() syscall on 32bit compat kernel Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 08/66] parisc,metag: Do not hardcode maximum userspace stack size Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 09/66] workqueue: fix dev_set_uevent_suppress() imbalance Greg Kroah-Hartman
2014-07-15 23:17 ` Greg Kroah-Hartman [this message]
2014-07-15 23:17 ` [PATCH 3.14 11/66] workqueue: zero cpumask of wq_numa_possible_cpumask on init Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 12/66] i8k: Fix non-SMP operation Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 13/66] thermal: hwmon: Make the check for critical temp valid consistent Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 14/66] hwmon: (amc6821) Fix permissions for temp2_input Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 15/66] hwmon: (emc2103) Clamp limits instead of bailing out Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 16/66] hwmon: (adm1031) Fix writes to limit registers Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 17/66] hwmon: (adm1029) Ensure the fan_div cache is updated in set_fan_div Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 18/66] hwmon: (adm1021) Fix cache problem when writing temperature limits Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 19/66] Revert "ACPI / AC: Remove ACs proc directory." Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 20/66] ACPI / resources: only reject zero length resources based at address zero Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 21/66] ACPI / EC: Avoid race condition related to advance_transaction() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 22/66] ACPI / EC: Add asynchronous command byte write support Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 23/66] ACPI / EC: Remove duplicated ec_wait_ibf0() waiter Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 24/66] ACPI / EC: Fix race condition in ec_transaction_completed() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 25/66] powerpc/perf: Never program book3s PMCs with values >= 0x80000000 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 26/66] powerpc/perf: Add PPMU_ARCH_207S define Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 27/66] powerpc/perf: Clear MMCR2 when enabling PMU Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 28/66] cpufreq: Makefile: fix compilation for davinci platform Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 29/66] crypto: sha512_ssse3 - fix byte count to bit count conversion Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 30/66] crypto: caam - fix memleak in caam_jr module Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 31/66] arm64: implement TASK_SIZE_OF Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 32/66] phy: core: Fix error path in phy_create() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 33/66] clk: spear3xx: Use proper control register offset Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 34/66] clk: s2mps11: Fix double free corruption during driver unbind Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 35/66] clk: qcom: HDMI source sel is 3 not 2 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 36/66] Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 37/66] dm io: fix a race condition in the wake up code for sync_io Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 38/66] dm: allocate a special workqueue for deferred device removal Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 39/66] intel_pstate: Fix setting VID Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 40/66] intel_pstate: dont touch turbo bit if turbo disabled or unavailable Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 41/66] intel_pstate: Update documentation of {max,min}_perf_pct sysfs files Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 42/66] intel_pstate: Set CPU number before accessing MSRs Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 43/66] PCI: Fix unaligned access in AF transaction pending test Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 44/66] ext4: fix unjournalled bg descriptor while initializing inode bitmap Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 45/66] ext4: clarify error count warning messages Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 46/66] ext4: clarify ext4_error message in ext4_mb_generate_buddy_error() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 47/66] ext4: disable synchronous transaction batching if max_batch_time==0 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 48/66] ext4: fix a potential deadlock in __ext4_es_shrink() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 49/66] drm/radeon/dpm: Reenabling SS on Cayman Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 50/66] drm/radeon: fix typo in ci_stop_dpm() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 51/66] drm/radeon: fix typo in golden register setup on evergreen Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 54/66] DMA, CMA: fix possible memory leak Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 55/66] ring-buffer: Check if buffer exists before polling Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 56/66] drivers/rtc/rtc-puv3.c: remove "&dev->" for typo issue Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 57/66] drivers/rtc/rtc-puv3.c: use dev_dbg() instead of dev_debug() " Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 58/66] powerpc: Disable RELOCATABLE for COMPILE_TEST with PPC64 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 59/66] Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option" Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 60/66] x86-64, espfix: Dont leak bits 31:16 of %esp returning to 16-bit stack Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 61/66] x86, espfix: Move espfix definitions into a separate header file Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 62/66] x86, espfix: Fix broken header guard Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 63/66] x86, espfix: Make espfix64 a Kconfig option, fix UML Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 64/66] x86, espfix: Make it possible to disable 16-bit support Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 65/66] x86, ioremap: Speed up check for RAM pages Greg Kroah-Hartman
2014-07-15 23:18 ` [PATCH 3.14 66/66] ACPI / battery: Retry to get battery information if failed during probing Greg Kroah-Hartman
2014-07-16  4:27 ` [PATCH 3.14 00/66] 3.14.13-stable review Guenter Roeck
2014-07-16 23:09 ` Greg Kroah-Hartman
2014-07-17 13:23 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140715231702.538107463@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=guz.fnst@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).