From: Wen Congyang <wency@cn.fujitsu.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: David Rientjes <rientjes@google.com>,
Tang Chen <tangchen@cn.fujitsu.com>,
mingo@redhat.com, miaox@cn.fujitsu.com,
linux-kernel@vger.kernel.org, linux-numa@vger.kernel.org
Subject: Re: [PATCH] Do not use cpu_to_node() to find an offlined cpu's node.
Date: Wed, 10 Oct 2012 17:33:41 +0800
Message-ID: <507540F5.7040501@cn.fujitsu.com>
In-Reply-To: <1349860216.7880.105.camel@twins>
At 10/10/2012 05:10 PM, Peter Zijlstra Wrote:
> On Tue, 2012-10-09 at 16:27 -0700, David Rientjes wrote:
>> On Tue, 9 Oct 2012, Peter Zijlstra wrote:
>>
>>> Well the code they were patching is in the wakeup path. As I think Tang
>>> said, we leave !runnable tasks on whatever cpu they ran on last, even if
>>> that cpu is offlined, we try and fix up state when we get a wakeup.
>>>
>>> On wakeup, it tries to find a cpu to run on and will try a cpu of the
>>> same node first.
>>>
>>> Now if that node's entirely gone away, it appears the cpu_to_node() map
>>> will not return a valid node number.
>>>
>>> I think that's a change in behaviour, it didn't used to do that afaik.
>>> Certainly this code hasn't changed in a while.
>>>
>>
>> If cpu_to_node() always returns a valid node id even if all cpus on the
>> node are offline, then the cpumask_of_node() implementation, which the
>> sched code is using, should either return an empty cpumask (if
>> node_to_cpumask_map[nid] isn't freed) or cpu_online_mask. The change in
>> behavior here occurred because
>> cpu_hotplug-unmap-cpu2node-when-the-cpu-is-hotremoved.patch in -mm doesn't
>> return a valid node id and forces it to return -1 so a kzalloc_node(...,
>> -1) falls back to allocate anywhere.
>
> I think that's broken semantics.. so far the entire cpu<->node mapping
> was invariant during hotplug. Changing that is going to be _very_
> interesting and cannot be done lightly.
>
> Because as I said, per-cpu memory is preserved over hotplug, and that
> has numa affinity.
Hmm, if per-cpu memory is preserved, we can't offline and remove that
memory, so we can't offline the node.
But if the node is hot-added and per-cpu memory doesn't use memory on
this node, we can hot-remove the cpu/memory on this node and then
offline the node.
Before the cpu is hot-added, its node is -1. We set the cpu<->node mapping
when it is hot-added, so the entire cpu<->node mapping was not invariant
during hotplug.
That is why I tried to clear it when the cpu is hot-removed.
Since we need the mapping to migrate a task to a cpu on the same node first,
I think we can clear the mapping when the node is offlined instead.
Thanks
Wen Congyang
>
> So for now, let me NACK that patch. You cannot go change stuff like
> that.
>
>>
>> But if you only need cpu_to_node() when waking up to find a runnable cpu
>> for this NUMA information, then I think you can just change the
>> kzalloc_node() in alloc_{fair,rt}_sched_group() to do
>> kzalloc(..., cpu_online(cpu) ? cpu_to_node(cpu) : NUMA_NO_NODE).
>
> That's a confusing statement, the wakeup stuff and the
> alloc_{fair,rt}_sched_group() stuff are unrelated, although both sites
> might need fixing if we're going to go ahead with this.
>
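To make the point under discussion concrete, here is a minimal userspace sketch of the "same node first" fallback described above. It is a toy model under stated assumptions, not the scheduler's code: `struct topo`, `pick_fallback_cpu()`, and the 4-cpu layout are hypothetical names invented for illustration. Only `NUMA_NO_NODE` (-1) mirrors the kernel's convention.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS      4
#define NUMA_NO_NODE (-1)

/* Toy stand-in for kernel topology state (hypothetical, illustration only). */
struct topo {
    int  cpu_node[NR_CPUS];   /* cpu -> node id, NUMA_NO_NODE if unmapped */
    bool cpu_online[NR_CPUS]; /* true if the cpu can run tasks */
};

/*
 * Pick a cpu for a task that last ran on prev_cpu (which may be offline).
 * Prefer an online cpu on prev_cpu's node; otherwise take any online cpu.
 * Returns -1 if no cpu is online.
 */
static int pick_fallback_cpu(const struct topo *t, int prev_cpu)
{
    int nid, cpu;

    assert(prev_cpu >= 0 && prev_cpu < NR_CPUS);
    nid = t->cpu_node[prev_cpu];

    /* Same-node pass: only possible while the mapping is still valid. */
    if (nid != NUMA_NO_NODE) {
        for (cpu = 0; cpu < NR_CPUS; cpu++) {
            if (t->cpu_online[cpu] && t->cpu_node[cpu] == nid)
                return cpu;
        }
    }

    /* Node unknown or has no online cpu: fall back to any online cpu. */
    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (t->cpu_online[cpu])
            return cpu;
    }
    return -1;
}
```

In this model, clearing `cpu_node[prev_cpu]` at cpu hot-remove time makes the same-node pass unreachable for tasks that last ran there, even while other cpus on that node are still online. Clearing it only when the whole node goes offline keeps the preference intact for as long as it can be satisfied.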
Thread overview: 25+ messages
2012-10-08 2:59 [PATCH] Do not use cpu_to_node() to find an offlined cpu's node Tang Chen
2012-10-09 6:21 ` David Rientjes
2012-10-09 8:34 ` Wen Congyang
2012-10-09 8:39 ` Tang Chen
2012-10-09 10:04 ` David Rientjes
2012-10-09 10:22 ` Wen Congyang
2012-10-09 20:00 ` David Rientjes
2012-10-09 10:57 ` Peter Zijlstra
2012-10-09 20:36 ` David Rientjes
2012-10-09 20:47 ` Peter Zijlstra
2012-10-09 23:27 ` David Rientjes
2012-10-10 2:06 ` Wen Congyang
2012-10-10 3:48 ` Wen Congyang
2012-10-10 9:10 ` Peter Zijlstra
2012-10-10 9:33 ` Wen Congyang [this message]
2012-10-10 9:51 ` Peter Zijlstra
2012-10-10 10:10 ` Wen Congyang
2012-10-10 10:07 ` Peter Zijlstra
2012-10-10 20:30 ` David Rientjes
2012-10-10 20:37 ` Andrew Morton
2012-10-10 20:57 ` David Rientjes
2012-10-18 0:52 ` David Rientjes
2012-10-18 2:51 ` Tang Chen
2012-10-18 3:29 ` David Rientjes
2012-10-19 11:18 ` Peter Zijlstra