From: Peter Zijlstra <peterz@infradead.org>
To: David Rientjes <rientjes@google.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>,
	mingo@redhat.com, miaox@cn.fujitsu.com, wency@cn.fujitsu.com,
	linux-kernel@vger.kernel.org, linux-numa@vger.kernel.org
Subject: Re: [PATCH] Do not use cpu_to_node() to find an offlined cpu's node.
Date: Tue, 09 Oct 2012 22:47:56 +0200
Message-ID: <1349815676.7880.85.camel@twins>
In-Reply-To: <alpine.DEB.2.00.1210091328030.32588@chino.kir.corp.google.com>

On Tue, 2012-10-09 at 13:36 -0700, David Rientjes wrote:
> On Tue, 9 Oct 2012, Peter Zijlstra wrote:
> 
> > On Mon, 2012-10-08 at 10:59 +0800, Tang Chen wrote:
> > > If a cpu is offline, its nid will be set to -1, and cpu_to_node(cpu) will
> > > return -1. As a result, cpumask_of_node(nid) will return NULL. In this case,
> > > find_next_bit() in for_each_cpu will get a NULL pointer and cause panic.
> > 
> > Hurm,. this is new, right? Who is changing all these semantics without
> > auditing the tree and informing all affected people?
> > 
> 
> I've nacked the patch that did it because I think it should be done from 
> the generic cpu hotplug code only at the CPU_DEAD level with a per-arch 
> callback to fixup whatever cpu-to-node mappings they maintain since 
> processes can reenter the scheduler at CPU_DYING.

Well, the code they were patching is in the wakeup path. As I think Tang
said, we leave !runnable tasks on whatever cpu they ran on last, even if
that cpu is offlined; we try to fix up state when we get a wakeup.

On wakeup, it tries to find a cpu to run on and will try a cpu of the
same node first.

Now if that node's entirely gone away, it appears the cpu_to_node() map
will not return a valid node number.

I think that's a change in behaviour; it didn't use to do that afaik.
Certainly this code hasn't changed in a while.
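
To make the failure mode concrete, here is a minimal user-space sketch of
the wakeup fallback described above. It is a model under stated
assumptions, not the scheduler code: every model_* function and both
arrays are hypothetical stand-ins for kernel state, and the only behaviour
taken from this thread is that the cpu-to-node lookup can yield -1 once
the node is gone.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS      8
#define NUMA_NO_NODE (-1)

/* Hypothetical stand-ins for kernel state; not the real data structures. */
static int  cpu_node_map[NR_CPUS]  = { 0, 0, 0, 0, 1, 1, 1, 1 };
static bool cpu_is_online[NR_CPUS] = { true, true, true, true,
				       true, true, true, true };

/* Model of cpu_to_node(): returns NUMA_NO_NODE once the mapping is cleared. */
static int model_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_node_map[cpu];
}

/* Model of a node going away: its cpus go offline and lose their mapping. */
static void model_offline_node(int nid)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu_node_map[cpu] == nid) {
			cpu_is_online[cpu] = false;
			cpu_node_map[cpu] = NUMA_NO_NODE;
		}
	}
}

/* Pick a cpu for a task waking up that last ran on prev_cpu. */
static int model_select_fallback_cpu(int prev_cpu)
{
	int nid = model_cpu_to_node(prev_cpu);

	/* Only scan the node if the lookup produced a valid node id. */
	if (nid != NUMA_NO_NODE) {
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			if (cpu_is_online[cpu] && cpu_node_map[cpu] == nid)
				return cpu;
	}

	/* Invalid node, or no online cpu on it: take any online cpu. */
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_is_online[cpu])
			return cpu;

	return -1; /* no online cpu at all */
}
```

Without the nid check, the node scan would be handed an invalid node id,
which is the case the patch under discussion is trying to avoid.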


> The whole issue seems to be because alloc_{fair,rt}_sched_group() does an 
> iteration over all possible cpus (not all online cpus) and does 
> kzalloc_node() which references a now-offlined node.  Changing it to -1 
> makes the slab code fallback to any online node.

Right, that's because the rq structures are assumed always present. What
I cannot remember is why I'm not using per-cpu allocations there,
because that's exactly what it looks like it wants to be.
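
As a rough illustration of the allocation behaviour David describes, the
sketch below models a node-aware allocator in user space. The names
model_kzalloc_node, struct model_alloc and node_is_online are all
hypothetical, and returning NULL for an offlined node is a simplification
chosen for the model, not a claim about what the slab code does; the only
part taken from the thread is that passing -1 lets the allocation fall
back to any online node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_NODES     2
#define NUMA_NO_NODE (-1)

/* Hypothetical node state: node 1 has been hot-removed. */
static bool node_is_online[NR_NODES] = { true, false };

/* Result of a modelled allocation: the memory and the node it came from. */
struct model_alloc {
	void *mem;
	int   node;
};

/* Model of a zeroing, node-aware allocation. */
static struct model_alloc model_kzalloc_node(size_t size, int nid)
{
	struct model_alloc a = { NULL, NUMA_NO_NODE };

	assert(nid == NUMA_NO_NODE || (nid >= 0 && nid < NR_NODES));

	if (nid == NUMA_NO_NODE) {
		/* No preference given: fall back to the first online node. */
		for (int n = 0; n < NR_NODES; n++) {
			if (node_is_online[n]) {
				nid = n;
				break;
			}
		}
		if (nid == NUMA_NO_NODE)
			return a; /* no online node at all */
	} else if (!node_is_online[nid]) {
		/* Simplification: a stale node reference fails cleanly here. */
		return a;
	}

	a.mem = calloc(1, size);
	if (a.mem)
		a.node = nid;
	return a;
}
```

In this model, iterating over all possible cpus and passing each cpu's
stale node id fails for cpus of the removed node, while passing -1
succeeds on node 0.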

> What I think we need to do instead of hacking only the acpi code and not 
> standardizing this across the kernel is:

Right, what I don't understand is wtf ACPI has to do with anything. We
have plenty of cpu hotplug code, and ACPI wasn't involved in any of it
last time I checked.

>  - reset cpu-to-node with a per-arch callback in generic cpu hotplug code 
>    at CPU_DEAD, and
> 
>  - do an iteration over all possible cpus for node hot-remove ensuring 
>    there are no stale references.

Why do we need to clear cpu-to-node maps? Are we going to change the
topology at runtime? What are you going to do with per-cpu stuff?
Per-cpu memory isn't freed on hotplug, so its node relation is static.

/me confused..
