Re: [RFC v3] sched/topology: fix kernel crash when a CPU is hotplugged in a memoryless node

From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Laurent Vivier <lvivier@redhat.com>,
	linux-kernel@vger.kernel.org,
	Michael Bringmann <mwb@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@redhat.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Nathan Fontenot <nfont@linux.vnet.ibm.com>,
	Borislav Petkov <bp@suse.de>,
	linuxppc-dev@lists.ozlabs.org,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [RFC v3] sched/topology: fix kernel crash when a CPU is hotplugged in a memoryless node
Date: Mon, 18 Mar 2019 16:17:30 +0530
Message-ID: <20190318104730.GA4450@linux.vnet.ibm.com>
In-Reply-To: <20190305115952.GH32477@hirez.programming.kicks-ass.net>

> > node 0 (because firmware doesn't provide the distance information for
> > memoryless/cpuless nodes):
> > 
> >   node   0   1   2   3
> >     0:  10  40  10  10
> >     1:  40  10  40  40
> >     2:  10  40  10  10
> >     3:  10  40  10  10
> 
> *groan*... what does it do for things like percpu memory? ISTR the
> per-cpu chunks are all allocated early too. Having them all use memory
> out of node-0 would seem sub-optimal.

In the specific failing case, there is only one node with memory; all other
nodes are CPU-only nodes.

However, in the generic case, since this is just a CPU hotplug operation, the
memory allocated early for the per-cpu chunks would remain in place.

Maybe Michael Ellerman can correct me here.

> 
> > We should have:
> > 
> >   node   0   1   2   3
> >     0:  10  40  40  40
> >     1:  40  10  40  40
> >     2:  40  40  10  40
> >     3:  40  40  40  10
> 
> Can it happen that it introduces a new distance in the table? One that
> hasn't been seen before? This example only has 10 and 40, but suppose
> the new node lands at distance 20 (or 80); can such a thing happen?
> 
> If not; why not?

Yes, distances can be 20, 40 or 80. Nothing forces the node distance to
always be 40.
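
To make the concern concrete, here is a rough user-space sketch (Python, not
kernel code). It assumes the scheduler's NUMA levels correspond to the set of
distinct values in the distance table. The first two tables are the ones
quoted in this thread; the third is a made-up variant in which node 3 ends up
at distance 20 from node 0 and 80 from nodes 1 and 2. The helper name
distinct_distances() is hypothetical.

```python
# User-space illustration only; this is not how the kernel code is written.

def distinct_distances(table):
    """Return the sorted distinct distances found in a square distance table."""
    return sorted({d for row in table for d in row})

# Table seen at boot, when firmware gives no distances for the empty nodes.
reported = [
    [10, 40, 10, 10],
    [40, 10, 40, 40],
    [10, 40, 10, 10],
    [10, 40, 10, 10],
]

# Table the patch description says we should have.
expected = [
    [10, 40, 40, 40],
    [40, 10, 40, 40],
    [40, 40, 10, 40],
    [40, 40, 40, 10],
]

# Hypothetical variant (an assumption for illustration): node 3 lands at
# distance 20 from node 0 and distance 80 from nodes 1 and 2.
hypothetical = [row[:] for row in expected]
for other, dist in ((0, 20), (1, 80), (2, 80)):
    hypothetical[3][other] = dist
    hypothetical[other][3] = dist

levels_at_boot = distinct_distances(reported)
levels_expected = distinct_distances(expected)
levels_hypothetical = distinct_distances(hypothetical)

# Distances that were never seen when the levels were first computed.
new_levels = sorted(set(levels_hypothetical) - set(levels_at_boot))

print("at boot:     ", levels_at_boot)
print("expected:    ", levels_expected)
print("hypothetical:", levels_hypothetical)
print("unseen:      ", new_levels)
```

In the quoted example both tables contain only 10 and 40, so the set of
distances does not change. In the hypothetical variant, 20 and 80 show up
after the levels were computed, which is the case being asked about.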

> So you're relying on sched_domain_numa_masks_set/clear() to fix this up,
> but that in turn relies on the sched_domain_numa_levels thing to stay
> accurate.
> 
> This all seems very fragile and unfortunate.
> 

Is there a specific reason why this is fragile?

-- 
Thanks and Regards
Srikar Dronamraju
