public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>, Tejun Heo <tj@kernel.org>
Subject: [PATCH 3.14 11/66] workqueue: zero cpumask of wq_numa_possible_cpumask on init
Date: Tue, 15 Jul 2014 16:17:05 -0700	[thread overview]
Message-ID: <20140715231702.585842404@linuxfoundation.org> (raw)
In-Reply-To: <20140715231702.156040999@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

commit 5a6024f1604eef119cf3a6fa413fe0261a81a8f3 upstream.

When hot-adding and onlining CPU, kernel panic occurs, showing following
call trace.

  BUG: unable to handle kernel paging request at 0000000000001d08
  IP: [<ffffffff8114acfd>] __alloc_pages_nodemask+0x9d/0xb10
  PGD 0
  Oops: 0000 [#1] SMP
  ...
  Call Trace:
   [<ffffffff812b8745>] ? cpumask_next_and+0x35/0x50
   [<ffffffff810a3283>] ? find_busiest_group+0x113/0x8f0
   [<ffffffff81193bc9>] ? deactivate_slab+0x349/0x3c0
   [<ffffffff811926f1>] new_slab+0x91/0x300
   [<ffffffff815de95a>] __slab_alloc+0x2bb/0x482
   [<ffffffff8105bc1c>] ? copy_process.part.25+0xfc/0x14c0
   [<ffffffff810a3c78>] ? load_balance+0x218/0x890
   [<ffffffff8101a679>] ? sched_clock+0x9/0x10
   [<ffffffff81105ba9>] ? trace_clock_local+0x9/0x10
   [<ffffffff81193d1c>] kmem_cache_alloc_node+0x8c/0x200
   [<ffffffff8105bc1c>] copy_process.part.25+0xfc/0x14c0
   [<ffffffff81114d0d>] ? trace_buffer_unlock_commit+0x4d/0x60
   [<ffffffff81085a80>] ? kthread_create_on_node+0x140/0x140
   [<ffffffff8105d0ec>] do_fork+0xbc/0x360
   [<ffffffff8105d3b6>] kernel_thread+0x26/0x30
   [<ffffffff81086652>] kthreadd+0x2c2/0x300
   [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60
   [<ffffffff815f20ec>] ret_from_fork+0x7c/0xb0
   [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60

In my investigation, I found the root cause is wq_numa_possible_cpumask.
All entries of wq_numa_possible_cpumask is allocated by
alloc_cpumask_var_node(). And these entries are used without initializing.
So these entries have wrong value.

When hot-adding and onlining CPU, wq_update_unbound_numa() is called.
wq_update_unbound_numa() calls alloc_unbound_pwq(). And alloc_unbound_pwq()
calls get_unbound_pool(). In get_unbound_pool(), worker_pool->node is set
as follow:

3592         /* if cpumask is contained inside a NUMA node, we belong to that node */
3593         if (wq_numa_enabled) {
3594                 for_each_node(node) {
3595                         if (cpumask_subset(pool->attrs->cpumask,
3596                                            wq_numa_possible_cpumask[node])) {
3597                                 pool->node = node;
3598                                 break;
3599                         }
3600                 }
3601         }

But wq_numa_possible_cpumask[node] does not have correct cpumask. So, wrong
node is selected. As a result, kernel panic occurs.

By this patch, all entries of wq_numa_possible_cpumask are allocated by
zalloc_cpumask_var_node to initialize them. And the panic disappeared.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: bce903809ab3 ("workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/workqueue.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5027,7 +5027,7 @@ static void __init wq_numa_init(void)
 	BUG_ON(!tbl);
 
 	for_each_node(node)
-		BUG_ON(!alloc_cpumask_var_node(&tbl[node], GFP_KERNEL,
+		BUG_ON(!zalloc_cpumask_var_node(&tbl[node], GFP_KERNEL,
 				node_online(node) ? node : NUMA_NO_NODE));
 
 	for_each_possible_cpu(cpu) {



  parent reply	other threads:[~2014-07-16  0:05 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-15 23:16 [PATCH 3.14 00/66] 3.14.13-stable review Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 01/66] usb: option: Add ID for Telewell TW-LTE 4G v2 Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 02/66] USB: cp210x: add support for Corsair usb dongle Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 03/66] USB: ftdi_sio: Add extra PID Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 04/66] USB: serial: ftdi_sio: Add Infineon Triboard Greg Kroah-Hartman
2014-07-15 23:16 ` [PATCH 3.14 05/66] iio: ti_am335x_adc: Fix: Use same step id at FIFOs both ends Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 06/66] parisc: add serial ports of C8000/1GHz machine to hardware database Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 07/66] parisc: fix fanotify_mark() syscall on 32bit compat kernel Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 08/66] parisc,metag: Do not hardcode maximum userspace stack size Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 09/66] workqueue: fix dev_set_uevent_suppress() imbalance Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 10/66] cpuset,mempolicy: fix sleeping function called from invalid context Greg Kroah-Hartman
2014-07-15 23:17 ` Greg Kroah-Hartman [this message]
2014-07-15 23:17 ` [PATCH 3.14 12/66] i8k: Fix non-SMP operation Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 13/66] thermal: hwmon: Make the check for critical temp valid consistent Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 14/66] hwmon: (amc6821) Fix permissions for temp2_input Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 15/66] hwmon: (emc2103) Clamp limits instead of bailing out Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 16/66] hwmon: (adm1031) Fix writes to limit registers Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 17/66] hwmon: (adm1029) Ensure the fan_div cache is updated in set_fan_div Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 18/66] hwmon: (adm1021) Fix cache problem when writing temperature limits Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 19/66] Revert "ACPI / AC: Remove ACs proc directory." Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 20/66] ACPI / resources: only reject zero length resources based at address zero Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 21/66] ACPI / EC: Avoid race condition related to advance_transaction() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 22/66] ACPI / EC: Add asynchronous command byte write support Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 23/66] ACPI / EC: Remove duplicated ec_wait_ibf0() waiter Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 24/66] ACPI / EC: Fix race condition in ec_transaction_completed() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 25/66] powerpc/perf: Never program book3s PMCs with values >= 0x80000000 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 26/66] powerpc/perf: Add PPMU_ARCH_207S define Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 27/66] powerpc/perf: Clear MMCR2 when enabling PMU Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 28/66] cpufreq: Makefile: fix compilation for davinci platform Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 29/66] crypto: sha512_ssse3 - fix byte count to bit count conversion Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 30/66] crypto: caam - fix memleak in caam_jr module Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 31/66] arm64: implement TASK_SIZE_OF Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 32/66] phy: core: Fix error path in phy_create() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 33/66] clk: spear3xx: Use proper control register offset Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 34/66] clk: s2mps11: Fix double free corruption during driver unbind Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 35/66] clk: qcom: HDMI source sel is 3 not 2 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 36/66] Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 37/66] dm io: fix a race condition in the wake up code for sync_io Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 38/66] dm: allocate a special workqueue for deferred device removal Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 39/66] intel_pstate: Fix setting VID Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 40/66] intel_pstate: dont touch turbo bit if turbo disabled or unavailable Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 41/66] intel_pstate: Update documentation of {max,min}_perf_pct sysfs files Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 42/66] intel_pstate: Set CPU number before accessing MSRs Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 43/66] PCI: Fix unaligned access in AF transaction pending test Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 44/66] ext4: fix unjournalled bg descriptor while initializing inode bitmap Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 45/66] ext4: clarify error count warning messages Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 46/66] ext4: clarify ext4_error message in ext4_mb_generate_buddy_error() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 47/66] ext4: disable synchronous transaction batching if max_batch_time==0 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 48/66] ext4: fix a potential deadlock in __ext4_es_shrink() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 49/66] drm/radeon/dpm: Reenabling SS on Cayman Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 50/66] drm/radeon: fix typo in ci_stop_dpm() Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 51/66] drm/radeon: fix typo in golden register setup on evergreen Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 54/66] DMA, CMA: fix possible memory leak Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 55/66] ring-buffer: Check if buffer exists before polling Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 56/66] drivers/rtc/rtc-puv3.c: remove "&dev->" for typo issue Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 57/66] drivers/rtc/rtc-puv3.c: use dev_dbg() instead of dev_debug() " Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 58/66] powerpc: Disable RELOCATABLE for COMPILE_TEST with PPC64 Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 59/66] Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option" Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 60/66] x86-64, espfix: Dont leak bits 31:16 of %esp returning to 16-bit stack Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 61/66] x86, espfix: Move espfix definitions into a separate header file Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 62/66] x86, espfix: Fix broken header guard Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 63/66] x86, espfix: Make espfix64 a Kconfig option, fix UML Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 64/66] x86, espfix: Make it possible to disable 16-bit support Greg Kroah-Hartman
2014-07-15 23:17 ` [PATCH 3.14 65/66] x86, ioremap: Speed up check for RAM pages Greg Kroah-Hartman
2014-07-15 23:18 ` [PATCH 3.14 66/66] ACPI / battery: Retry to get battery information if failed during probing Greg Kroah-Hartman
2014-07-16  4:27 ` [PATCH 3.14 00/66] 3.14.13-stable review Guenter Roeck
2014-07-16 23:09 ` Greg Kroah-Hartman
2014-07-17 13:23 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140715231702.585842404@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=isimatu.yasuaki@jp.fujitsu.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox