From: Tang Chen <tangchen@cn.fujitsu.com>
To: linux-kernel@vger.kernel.org, x86@kernel.org, linux-numa@vger.kernel.org
Cc: Wen Congyang <wency@cn.fujitsu.com>
Subject: [BUG] Failed to online cpu on a hot-added NUMA node.
Date: Mon, 10 Sep 2012 18:31:52 +0800
Message-ID: <504DC198.6080602@cn.fujitsu.com>
Hi,
When I hot-add a node, all the cpus on it are offline.
When I bring one of them online, I get the following error message.
[ 762.759364] Call Trace:
[ 762.759371] [<ffffffff8106ec2f>] warn_slowpath_common+0x7f/0xc0
[ 762.759374] [<ffffffff8106ec8a>] warn_slowpath_null+0x1a/0x20
[ 762.759377] [<ffffffff810b463b>] init_sched_groups_power+0xcb/0xd0
[ 762.759380] [<ffffffff810b49fc>] build_sched_domains+0x3bc/0x6a0
[ 762.759387] [<ffffffff810e2e73>] ? __lock_release+0x133/0x1a0
[ 762.759390] [<ffffffff810b51f7>] partition_sched_domains+0x347/0x530
[ 762.759393] [<ffffffff810b4ff2>] ? partition_sched_domains+0x142/0x530
[ 762.759399] [<ffffffff81102bd3>] cpuset_update_active_cpus+0x83/0x90
[ 762.759402] [<ffffffff810b5418>] cpuset_cpu_active+0x38/0x70
[ 762.759411] [<ffffffff81681167>] notifier_call_chain+0x67/0x150
[ 762.759417] [<ffffffff81670bff>] ? native_cpu_up+0x194/0x1c7
[ 762.759422] [<ffffffff810a36be>] __raw_notifier_call_chain+0xe/0x10
[ 762.759426] [<ffffffff81072d70>] __cpu_notify+0x20/0x40
[ 762.759430] [<ffffffff81672af7>] _cpu_up+0xfc/0x144
[ 762.759433] [<ffffffff81672c12>] cpu_up+0xd3/0xe6
[ 762.759439] [<ffffffff81662a1c>] store_online+0x9c/0xd0
[ 762.759447] [<ffffffff81441f80>] dev_attr_store+0x20/0x30
[ 762.759454] [<ffffffff812547a3>] sysfs_write_file+0xa3/0x100
[ 762.759462] [<ffffffff811d62a0>] vfs_write+0xd0/0x1a0
[ 762.759465] [<ffffffff811d6474>] sys_write+0x54/0xa0
[ 762.759471] [<ffffffff81686269>] system_call_fastpath+0x16/0x1b
[ 762.759473] ---[ end trace 75068e651299460b ]---
[ 762.759493] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
In init_sched_groups_power(), we get a NULL pointer sg, which should
have been initialized in build_overlap_sched_groups().
In build_overlap_sched_groups(),
	cpumask_copy(sg_span, sched_domain_span(child));
the new cpu is not set in sched_domain_span(child). It should have been
set in build_sched_domain(),
	cpumask_and(sched_domain_span(sd), cpu_map, tl->mask(cpu));
But at the NUMA topology levels, the masks of the cpus on the new node
are not set in the sched_domains_numa_masks array when they are
hot-added, which means the new cpus are not set in tl->mask(cpu).
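To make the failure easier to see, here is a small user-space model of the mask arithmetic described above. It is only a sketch, not kernel code: cpumasks are modelled as 64-bit integers, the layout (cpus 0-3 on node 0, cpus 4-7 on the hot-added node 1) is assumed, and every model_* name is hypothetical. It also assumes a single NUMA level at which each node's mask covers all cpus known at boot.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per cpu; a stand-in for struct cpumask. */
typedef uint64_t mask_t;

#define MODEL_NR_CPUS  8
#define MODEL_NR_NODES 2

/* Stand-in for one level of sched_domains_numa_masks, filled once at boot. */
static mask_t model_numa_masks[MODEL_NR_NODES];

/* Assumed layout: cpus 0-3 on node 0, cpus 4-7 on node 1. */
static int model_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return cpu / 4;
}

/* At boot, each node's mask only knows the cpus that exist at that time. */
static void model_numa_masks_init(mask_t boot_cpus)
{
	for (int node = 0; node < MODEL_NR_NODES; node++)
		model_numa_masks[node] = boot_cpus;
}

/* Mirrors cpumask_and(sched_domain_span(sd), cpu_map, tl->mask(cpu)). */
static mask_t model_domain_span(mask_t cpu_map, int cpu)
{
	return cpu_map & model_numa_masks[model_cpu_to_node(cpu)];
}

/* Returns 1 if cpu is set in span, 0 otherwise. */
static int model_span_has_cpu(mask_t span, int cpu)
{
	return (int)((span >> cpu) & 1);
}
```

With boot cpus 0-3 (0x0F) and cpu 4 added to cpu_map afterwards (0x1F), model_domain_span(0x1F, 4) evaluates to 0x0F: the domain built for cpu 4 does not contain cpu 4 itself, which matches the situation described above.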
Should we set the masks of the hot-added cpus in sched_domains_numa_masks
when they are brought online?
If I want to fix this, do I need to add a new notifier to the cpu
notifier chain?
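One possible shape for such a notifier, again as a user-space sketch rather than a patch: on an online event the cpu is set in the mask of every node within each level's distance, and on a dead event it is cleared everywhere. All toy_* names, the two-node distance table, and the event enum are assumptions made for illustration; a real implementation would hook into the kernel's cpu notifier chain and use the real cpumask helpers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t mask_t;

#define TOY_NR_CPUS   8
#define TOY_NR_NODES  2
#define TOY_NR_LEVELS 2

/* Assumed distances: 10 to the local node, 20 to the remote node. */
static const int toy_node_distance[TOY_NR_NODES][TOY_NR_NODES] = {
	{ 10, 20 },
	{ 20, 10 },
};

/* Maximum distance covered by each NUMA level. */
static const int toy_level_distance[TOY_NR_LEVELS] = { 10, 20 };

/* Stand-in for sched_domains_numa_masks[level][node]; zero until cpus come up. */
static mask_t toy_numa_masks[TOY_NR_LEVELS][TOY_NR_NODES];

/* Assumed layout: cpus 0-3 on node 0, cpus 4-7 on node 1. */
static int toy_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return cpu / 4;
}

/* Set cpu in the mask of every node that is close enough at each level. */
static void toy_numa_masks_set(int cpu)
{
	int node = toy_cpu_to_node(cpu);

	for (int l = 0; l < TOY_NR_LEVELS; l++)
		for (int n = 0; n < TOY_NR_NODES; n++)
			if (toy_node_distance[n][node] <= toy_level_distance[l])
				toy_numa_masks[l][n] |= (mask_t)1 << cpu;
}

/* Clear cpu from every mask. */
static void toy_numa_masks_clear(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	for (int l = 0; l < TOY_NR_LEVELS; l++)
		for (int n = 0; n < TOY_NR_NODES; n++)
			toy_numa_masks[l][n] &= ~((mask_t)1 << cpu);
}

enum toy_cpu_event { TOY_CPU_ONLINE, TOY_CPU_DEAD };

/* Hypothetical notifier callback: keep the masks in sync with hotplug events. */
static void toy_numa_masks_notify(enum toy_cpu_event event, int cpu)
{
	if (event == TOY_CPU_ONLINE)
		toy_numa_masks_set(cpu);
	else
		toy_numa_masks_clear(cpu);
}
```

In this model, bringing cpu 4 online sets bit 4 in node 1's mask at the local level and in both nodes' masks at the remote level, so a later span computation for cpu 4 would include the cpu itself. Whether the update belongs in a new notifier or in an existing one is the open question.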
Thanks. :)