[PATCH] workqueue: initialize cpumask of wq_numa_possible_cpumask

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
To: <tj@kernel.org>, <laijs@cn.fujitsu.com>
Cc: <linux-kernel@vger.kernel.org>
Subject: [PATCH] workqueue: initialize cpumask of wq_numa_possible_cpumask
Date: Mon, 7 Jul 2014 02:21:21 +0900	[thread overview]
Message-ID: <53B98591.5030000@jp.fujitsu.com> (raw)

When hot-adding and onlining CPU, kernel panic occurs, showing following
call trace.

 BUG: unable to handle kernel paging request at 0000000000001d08
 IP: [<ffffffff8114acfd>] __alloc_pages_nodemask+0x9d/0xb10
 PGD 0
 Oops: 0000 [#1] SMP
 ...
 Call Trace:
  [<ffffffff812b8745>] ? cpumask_next_and+0x35/0x50
  [<ffffffff810a3283>] ? find_busiest_group+0x113/0x8f0
  [<ffffffff81193bc9>] ? deactivate_slab+0x349/0x3c0
  [<ffffffff811926f1>] new_slab+0x91/0x300
  [<ffffffff815de95a>] __slab_alloc+0x2bb/0x482
  [<ffffffff8105bc1c>] ? copy_process.part.25+0xfc/0x14c0
  [<ffffffff810a3c78>] ? load_balance+0x218/0x890
  [<ffffffff8101a679>] ? sched_clock+0x9/0x10
  [<ffffffff81105ba9>] ? trace_clock_local+0x9/0x10
  [<ffffffff81193d1c>] kmem_cache_alloc_node+0x8c/0x200
  [<ffffffff8105bc1c>] copy_process.part.25+0xfc/0x14c0
  [<ffffffff81114d0d>] ? trace_buffer_unlock_commit+0x4d/0x60
  [<ffffffff81085a80>] ? kthread_create_on_node+0x140/0x140
  [<ffffffff8105d0ec>] do_fork+0xbc/0x360
  [<ffffffff8105d3b6>] kernel_thread+0x26/0x30
  [<ffffffff81086652>] kthreadd+0x2c2/0x300
  [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60
  [<ffffffff815f20ec>] ret_from_fork+0x7c/0xb0
  [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60

In my investigation, I found the root cause is wq_numa_possible_cpumask.
All entries of wq_numa_possible_cpumask is allocated by
alloc_cpumask_var_node(). And these entries are used without initializing.
So these entries have wrong value.

When hot-adding and onlining CPU, wq_update_unbound_numa() is called.
wq_update_unbound_numa() calls alloc_unbound_pwq(). And alloc_unbound_pwq()
calls get_unbound_pool(). In get_unbound_pool(), worker_pool->node is set
as follow:

#kernel/workqueue.c
3592         /* if cpumask is contained inside a NUMA node, we belong to that node */
3593         if (wq_numa_enabled) {
3594                 for_each_node(node) {
3595                         if (cpumask_subset(pool->attrs->cpumask,
3596                                            wq_numa_possible_cpumask[node])) {
3597                                 pool->node = node;
3598                                 break;
3599                         }
3600                 }
3601         }

But wq_numa_possible_cpumask[node] does not have correct cpumask. So, wrong
node is selected. As a result, kernel panic occurs.

By this patch, all entries of wq_numa_possible_cpumask are allocated by
zalloc_cpumask_var_node to initialize them. And the panic disappeared.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 6203d29..b393ded 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3338,7 +3338,7 @@ struct workqueue_attrs *alloc_workqueue_attrs(gfp_t gfp_mask)
 	attrs = kzalloc(sizeof(*attrs), gfp_mask);
 	if (!attrs)
 		goto fail;
-	if (!alloc_cpumask_var(&attrs->cpumask, gfp_mask))
+	if (!zalloc_cpumask_var(&attrs->cpumask, gfp_mask))
 		goto fail;

 	cpumask_copy(attrs->cpumask, cpu_possible_mask);

next             reply	other threads:[~2014-07-06 17:23 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-06 17:21 Yasuaki Ishimatsu [this message]
2014-07-07  0:19 ` [PATCH] workqueue: initialize cpumask of wq_numa_possible_cpumask Lai Jiangshan
2014-07-07  0:33   ` Yasuaki Ishimatsu
2014-07-07  1:04     ` Lai Jiangshan
2014-07-07  1:20       ` Yasuaki Ishimatsu
2014-07-07  0:47   ` [resend PATCH] " Yasuaki Ishimatsu
2014-07-07 14:01     ` [PATCH wq/for-3.16-fixes] workqueue: zero cpumask of wq_numa_possible_cpumask on init Tejun Heo

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:6203d29 dfblob:b393ded )
 OR (
bs:"[PATCH] workqueue: initialize cpumask of wq_numa_possible_cpumask" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53B98591.5030000@jp.fujitsu.com \
    --to=isimatu.yasuaki@jp.fujitsu.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).