public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, John Stultz <john.stultz@linaro.org>,
	Mel Gorman <mgorman@suse.de>, Juri Lelli <juri.lelli@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Li Zefan <lizefan@huawei.com>, Tejun Heo <tj@kernel.org>
Subject: [PATCH 3.4 51/60] cpuset: Fix memory allocator deadlock
Date: Mon,  2 Dec 2013 11:06:32 -0800	[thread overview]
Message-ID: <20131202190341.438467330@linuxfoundation.org> (raw)
In-Reply-To: <20131202190330.152596462@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <peterz@infradead.org>

commit 0fc0287c9ed1ffd3706f8b4d9b314aa102ef1245 upstream.

Juri hit the below lockdep report:

[    4.303391] ======================================================
[    4.303392] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[    4.303394] 3.12.0-dl-peterz+ #144 Not tainted
[    4.303395] ------------------------------------------------------
[    4.303397] kworker/u4:3/689 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[    4.303399]  (&p->mems_allowed_seq){+.+...}, at: [<ffffffff8114e63c>] new_slab+0x6c/0x290
[    4.303417]
[    4.303417] and this task is already holding:
[    4.303418]  (&(&q->__queue_lock)->rlock){..-...}, at: [<ffffffff812d2dfb>] blk_execute_rq_nowait+0x5b/0x100
[    4.303431] which would create a new lock dependency:
[    4.303432]  (&(&q->__queue_lock)->rlock){..-...} -> (&p->mems_allowed_seq){+.+...}
[    4.303436]

[    4.303898] the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
[    4.303918] -> (&p->mems_allowed_seq){+.+...} ops: 2762 {
[    4.303922]    HARDIRQ-ON-W at:
[    4.303923]                     [<ffffffff8108ab9a>] __lock_acquire+0x65a/0x1ff0
[    4.303926]                     [<ffffffff8108cbe3>] lock_acquire+0x93/0x140
[    4.303929]                     [<ffffffff81063dd6>] kthreadd+0x86/0x180
[    4.303931]                     [<ffffffff816ded6c>] ret_from_fork+0x7c/0xb0
[    4.303933]    SOFTIRQ-ON-W at:
[    4.303933]                     [<ffffffff8108abcc>] __lock_acquire+0x68c/0x1ff0
[    4.303935]                     [<ffffffff8108cbe3>] lock_acquire+0x93/0x140
[    4.303940]                     [<ffffffff81063dd6>] kthreadd+0x86/0x180
[    4.303955]                     [<ffffffff816ded6c>] ret_from_fork+0x7c/0xb0
[    4.303959]    INITIAL USE at:
[    4.303960]                    [<ffffffff8108a884>] __lock_acquire+0x344/0x1ff0
[    4.303963]                    [<ffffffff8108cbe3>] lock_acquire+0x93/0x140
[    4.303966]                    [<ffffffff81063dd6>] kthreadd+0x86/0x180
[    4.303969]                    [<ffffffff816ded6c>] ret_from_fork+0x7c/0xb0
[    4.303972]  }

Which reports that we take mems_allowed_seq with interrupts enabled. A
little digging found that this can only be from
cpuset_change_task_nodemask().

This is an actual deadlock because an interrupt doing an allocation will
hit get_mems_allowed()->...->__read_seqcount_begin(), which will spin
forever waiting for the write side to complete.

Cc: John Stultz <john.stultz@linaro.org>
Cc: Mel Gorman <mgorman@suse.de>
Reported-by: Juri Lelli <juri.lelli@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Juri Lelli <juri.lelli@gmail.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/cpuset.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -983,8 +983,10 @@ static void cpuset_change_task_nodemask(
 	need_loop = task_has_mempolicy(tsk) ||
 			!nodes_intersects(*newmems, tsk->mems_allowed);
 
-	if (need_loop)
+	if (need_loop) {
+		local_irq_disable();
 		write_seqcount_begin(&tsk->mems_allowed_seq);
+	}
 
 	nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
 	mpol_rebind_task(tsk, newmems, MPOL_REBIND_STEP1);
@@ -992,8 +994,10 @@ static void cpuset_change_task_nodemask(
 	mpol_rebind_task(tsk, newmems, MPOL_REBIND_STEP2);
 	tsk->mems_allowed = *newmems;
 
-	if (need_loop)
+	if (need_loop) {
 		write_seqcount_end(&tsk->mems_allowed_seq);
+		local_irq_enable();
+	}
 
 	task_unlock(tsk);
 }



  parent reply	other threads:[~2013-12-02 19:07 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-02 19:05 [PATCH 3.4 00/60] 3.4.72-stable review Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 01/60] ARM: sa11x0/assabet: ensure CS2 is configured appropriately Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 02/60] ARM: integrator_cp: Set LCD{0,1} enable lines when turning on CLCD Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 03/60] Staging: tidspbridge: disable driver Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 04/60] backlight: atmel-pwm-bl: fix reported brightness Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 05/60] ASoC: ak4642: prevent un-necessary changes to SG_SL1 Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 06/60] ASoC: wm8962: Turn on regcache_cache_only before disabling regulator Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 07/60] ASoC: blackfin: Fix missing break Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 08/60] alarmtimer: return EINVAL instead of ENOTSUPP if rtcdev doesnt exist Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 09/60] devpts: plug the memory leak in kill_sb Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 10/60] can: flexcan: fix flexcan_chip_start() on imx6 Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 11/60] libata: Fix display of sata speed Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 12/60] drivers/libata: Set max sector to 65535 for Slimtype DVD A DS8A9SH drive Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 13/60] vsprintf: check real user/group id for %pK Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 14/60] rtlwifi: rtl8192se: Fix wrong assignment Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 15/60] rtlwifi: rtl8192cu: Fix more pointer arithmetic errors Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 16/60] ahci: disabled FBS prior to issuing software reset Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 17/60] ahci: add Marvell 9230 to the AHCI PCI device list Greg Kroah-Hartman
2013-12-02 19:05 ` [PATCH 3.4 18/60] iscsi-target: fix extract_param to handle buffer length corner case Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 19/60] iscsi-target: chap auth shouldnt match username with trailing garbage Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 20/60] IB/ipath: Convert ipath_user_sdma_pin_pages() to use get_user_pages_fast() Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 21/60] loop: fix crash if blk_alloc_queue fails Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 22/60] mtd: nand: hack ONFI for non-power-of-2 dimensions Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 23/60] mtd: map: fixed bug in 64-bit systems Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 24/60] mtd: gpmi: fix kernel BUG due to racing DMA operations Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 25/60] ext4: avoid bh leak in retry path of ext4_expand_extra_isize_ea() Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 26/60] xen/blkback: fix reference counting Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 27/60] staging: vt6656: [BUG] Fix for TX USB resets from vendors driver Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 28/60] rtlwifi: rtl8192de: Fix incorrect signal strength for unassociated AP Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 29/60] rtlwifi: rtl8192se: " Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 30/60] rtlwifi: rtl8192cu: " Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 31/60] qeth: avoid buffer overflow in snmp ioctl Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 32/60] rt2400pci: fix RSSI read Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 33/60] dm: allocate buffer for messages with small number of arguments using GFP_NOIO Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 34/60] PM / hibernate: Avoid overflow in hibernate_preallocate_memory() Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 35/60] mwifiex: correct packet length for packets from SDIO interface Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 36/60] audit: printk USER_AVC messages when audit isnt enabled Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 37/60] audit: use nlmsg_len() to get message payload length Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 38/60] audit: fix info leak in AUDIT_GET requests Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 39/60] PCI: Remove duplicate pci_disable_device() from pcie_portdrv_remove() Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 40/60] selinux: correct locking in selinux_netlbl_socket_connect) Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 41/60] avr32: setup crt for early panic() Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 42/60] avr32: fix out-of-range jump in large kernels Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 43/60] prism54: set netdev type to "wlan" Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 44/60] drm/ttm: Handle in-memory region copies Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 45/60] drm/i915: flush cursors harder Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 46/60] drm/nouveau: when bailing out of a pushbuf ioctl, do not remove previous fence Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 47/60] drm/radeon/si: fix define for MC_SEQ_TRAIN_WAKEUP_CNTL Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 48/60] radeon: workaround pinning failure on low ram gpu Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 49/60] md: fix calculation of stacking limits on level change Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 50/60] powerpc/signals: Improved mark VSX not saved with small contexts fix Greg Kroah-Hartman
2013-12-02 19:06 ` Greg Kroah-Hartman [this message]
2013-12-02 19:06 ` [PATCH 3.4 52/60] ALSA: hda/realtek - Set pcbeep amp for ALC668 Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 53/60] tracing: Allow events to have NULL strings Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 54/60] Input: i8042 - add PNP modaliases Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 55/60] KVM: perform an invalid memslot step for gpa base change Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 56/60] KVM: Fix iommu map/unmap to handle memory slot moves Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 57/60] ftrace: Fix function graph with loading of modules Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 58/60] media: lirc_zilog: Dont use dynamic static allocation Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 59/60] HID: roccat: fix Coverity CID 141438 Greg Kroah-Hartman
2013-12-02 19:06 ` [PATCH 3.4 60/60] HID: apple: option to swap the Option ("Alt") and Command ("Flag") keys Greg Kroah-Hartman
2013-12-03  2:50 ` [PATCH 3.4 00/60] 3.4.72-stable review Guenter Roeck
2013-12-03  3:04   ` Greg Kroah-Hartman
2013-12-03 21:56 ` Shuah Khan
2013-12-04 10:23 ` Satoru Takeuchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131202190341.438467330@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=john.stultz@linaro.org \
    --cc=juri.lelli@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=mgorman@suse.de \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox