linuxppc-dev.lists.ozlabs.org archive mirror
From: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [BUG] rebuild_sched_domains considered dangerous
Date: Wed, 11 May 2011 11:17:52 -0500
Message-ID: <4DCAB6B0.8020904@linux.vnet.ibm.com>
In-Reply-To: <1305036563.2914.80.camel@laptop>

On 05/10/2011 09:09 AM, Peter Zijlstra wrote:
> On Mon, 2011-05-09 at 16:26 -0500, Jesse Larrew wrote:
>>
>> According to the Power firmware folks, updating the home node of a
>> virtual cpu happens rather infrequently. The VPHN code currently
>> checks for topology updates every 60 seconds, but we can poll less
>> frequently if it helps. I chose 60 second intervals simply because
>> that's how often they check the topology on s390. ;-)
> 
> This just makes me shudder, so you poll the state? Meaning that the vcpu
> can actually run 99% of the time on another node?
> 
> What's the point of this if the vcpu scheduler can move the vcpu around
> much faster?
> 

Based on my discussion with the firmware folks, it sounds like the hypervisor will never automatically move vcpus around on its own. The firmware is designed to set the cpu home node at partition boot, then wait for the customer to run a tool to rebalance the affinity. Moving vcpus around costs performance, so they want to let the customer decide when to shuffle the vcpus. 

From the kernel's perspective, we can expect to see occasional batches of vcpus updating at once, after which the topology should remain fixed until the tool is run again.

>> As for updating the memory topology, there are cases where changing
>> the home node of a virtual cpu doesn't affect the memory topology. If
>> it does, there is a separate notification system for memory topology
>> updates that is independent from the cpu updates. I plan to start
>> working on a patch set to enable memory topology updates in the kernel
>> in the coming weeks, but I wanted to get the cpu patches out on the
>> list so we could start having these debates. :) 
> 
> Well, they weren't put out on a list (well maybe on the ppc list but
> that's the same as not posting them from my pov), they were merged (and
> thus declared done) that's not how you normally start a debate.
> 

That's a fair point. At the time, I didn't expect anyone outside of the PPC community to care much about a PPC-specific patch set, but I see now why it's important to keep everyone in the loop. Sorry about that. I'll be sure to send any future patches to LKML as well.

> I would really like to see both patch-sets together. Also, I'm not at
> all convinced it's a sane thing to do. Pretty much all NUMA aware
> software I know of assumes that CPU<->NODE relations are static,
> breaking that in kernel renders all existing software broken.
> 

I suspect that's true. Then again, shouldn't it be the capabilities of the hardware that dictate what the software does, rather than the other way around?

-- 

Jesse Larrew
Software Engineer, Linux on Power Kernel Team
IBM Linux Technology Center
Phone: (512) 973-2052 (T/L: 363-2052)
jlarrew@linux.vnet.ibm.com
