From: peterz@infradead.org (Peter Zijlstra)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC 0/6] rework sched_domain topology description
Date: Mon, 17 Mar 2014 12:52:25 +0100
Message-ID: <20140317115225.GZ9987@twins.programming.kicks-ass.net>
In-Reply-To: <532060E7.7010203@arm.com>

On Wed, Mar 12, 2014 at 01:28:07PM +0000, Dietmar Eggemann wrote:
> On 11/03/14 13:17, Peter Zijlstra wrote:
> > On Sat, Mar 08, 2014 at 12:40:58PM +0000, Dietmar Eggemann wrote:
> >>>
> >>> I don't have a strong opinion about using or not a cpu argument for
> >>> setting the flags of a level (it was part of the initial proposal
> >>> before we start to completely rework the build of sched_domain)
> >>> Nevertheless, I see one potential concern that you can have completely
> >>> different flags configuration of the same sd level of 2 cpus.
> >>
> >> Could you elaborate a little bit further regarding the last sentence? Do you
> >> think that those completely different flags configuration would make it
> >> impossible, that the load-balance code could work at all at this sd?
> > 
> > So a problem with such an interface is that it makes it far too easy to
> > generate completely broken domains.
> 
> I see the point. What I'm still struggling with is to understand why
> this interface is worse than the one where we set-up additional,
> adjacent sd levels with new cpu_foo_mask functions plus different static
> sd-flags configurations and rely on the sd degenerate functionality in
> the core scheduler to fold these levels together to achieve different
> per cpu sd flags configurations.

Well, the folding of SD levels is 'safe' in that it keeps domains
internally consistent.

> IMHO, exposing struct sched_domain_topology_level bar_topology[] to the
> arch is the reason why the core scheduler has to check if the arch
> provides a sane sd setup in both cases.

Up to a point, yes. On the other hand, the reason we have the degenerate
stuff is that the topology was generic and might contain pointless
levels that the architecture didn't actually have.

By moving the topology setup into the arch, that could be made to go
away (not sure you want to do that, but you could).

But yes, by moving the topology setup out of the core code, you need
some extra validation to make sure that whatever you're fed makes some
kind of sense.
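
As a rough illustration of what such validation could look like, here is
a user-space sketch (not kernel code) that checks one plausible notion
of "makes sense": walking the levels bottom-up for a given CPU, each
level's span must contain the span of the level below it. The span_fn
type, the bitmask representation (bit n = CPU n), and the example
core_span()/cluster_span() callbacks are all assumptions made up for
this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-level callback: CPUs spanned by this level as seen from 'cpu'. */
typedef uint32_t (*span_fn)(int cpu);

/*
 * Return true when, for 'cpu', every level's span is a superset of the
 * span of the level below it (the CPU itself being the lowest "level").
 */
static bool topology_nested(const span_fn *levels, int nr_levels, int cpu)
{
	uint32_t child;
	int i;

	assert(cpu >= 0 && cpu < 32);
	child = UINT32_C(1) << cpu;	/* a CPU always spans itself */

	for (i = 0; i < nr_levels; i++) {
		uint32_t span = levels[i](cpu);

		if ((child & ~span) != 0)	/* child is not a subset of this level */
			return false;
		child = span;
	}
	return true;
}

/* Example layout: cores are pairs of CPUs, clusters are groups of four. */
static uint32_t core_span(int cpu)    { return UINT32_C(0x3) << (cpu & ~1); }
static uint32_t cluster_span(int cpu) { return UINT32_C(0xf) << (cpu & ~3); }
```

Listing the levels as { core_span, cluster_span } passes the check;
listing them in the opposite order fails it, because a cluster is not
contained in a core.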

> > You can, for two cpus in the same domain, provide different flags; such
> > a configuration doesn't make any sense at all.
> > 
> > Now I see why people would like to have this; but unless we can make it
> > robust I'd be very hesitant to go this route.
> > 
> 
> By making it robust, I guess you mean that the core scheduler has to
> check that the provided set-ups are sane, something like the following
> code snippet in sd_init()
> 
> if (WARN_ONCE(tl->sd_flags & ~TOPOLOGY_SD_FLAGS,
> 		"wrong sd_flags in topology description\n"))
> 	tl->sd_flags &= ~TOPOLOGY_SD_FLAGS;
> 
> but for per cpu set-ups.

So a domain is principally a group of CPUs with the same properties.
However, per-cpu domain attributes allow you to specify different domain
properties within the one domain mask.

That's completely broken.

So the way to validate something like that would be:

	cpu = cpumask_first(tl->mask());
	flags = tl->flags(cpu);

	/* every remaining CPU in the mask must report the same flags */
	while ((cpu = cpumask_next(cpu, tl->mask())) < nr_cpu_ids)
		BUG_ON(tl->flags(cpu) != flags);

Or something along those lines.
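
For anyone who wants to play with the idea outside the kernel, the same
check can be modelled in plain user-space C. This is only a sketch under
stated assumptions: the mask is a 32-bit bitmap (bit n = CPU n), NR_CPUS
is an arbitrary small constant, flags_fn stands in for a per-cpu
tl->flags(cpu) callback, and the function reports inconsistency by
returning false instead of calling BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical system size for this sketch */

/* Hypothetical per-CPU callback, standing in for tl->flags(cpu). */
typedef unsigned int (*flags_fn)(int cpu);

/*
 * Return true when every CPU set in 'mask' reports the same flags,
 * i.e. the domain is internally consistent. An empty mask is
 * trivially consistent.
 */
static bool domain_flags_consistent(uint32_t mask, flags_fn flags)
{
	unsigned int first = 0;
	bool seen = false;
	int cpu;

	assert(flags);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (UINT32_C(1) << cpu)))
			continue;
		if (!seen) {
			first = flags(cpu);	/* reference value from the first CPU */
			seen = true;
		} else if (flags(cpu) != first) {
			return false;		/* two CPUs in one domain disagree */
		}
	}
	return true;
}

/* Example callbacks: one uniform, one that differs between odd and even CPUs. */
static unsigned int uniform_flags(int cpu)  { (void)cpu; return 0x3; }
static unsigned int odd_even_flags(int cpu) { return (cpu & 1) ? 0x1 : 0x3; }
```

With odd_even_flags, a mask covering CPUs 0-3 is rejected, while a mask
covering only CPUs 0 and 2 is accepted, since both are even and report
the same value.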

But for me it's far easier to think in the simple one-domain, one-flags
scenario. The whole degenerate folding is a very simple optimization
that simply removes redundant levels.
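
To make "removing redundant levels" concrete, here is a deliberately
simplified user-space sketch. It treats a level as redundant when it
spans exactly the same CPUs as the level kept below it and contributes
no flag bits that level lacks. struct level, its fields, and
fold_levels() are hypothetical names for this example; the scheduler's
real degenerate checks look at more than this.

```c
#include <assert.h>
#include <stdint.h>

struct level {
	uint32_t span;		/* CPUs covered, bit n = CPU n */
	unsigned int flags;	/* hypothetical flag bits */
};

/*
 * Compact 'lv' in place, dropping every level that has the same span as
 * the level kept below it and adds no new flag bits. Returns the number
 * of levels kept.
 */
static int fold_levels(struct level *lv, int n)
{
	int kept = 0, i;

	assert(lv || n == 0);
	for (i = 0; i < n; i++) {
		if (kept &&
		    lv[kept - 1].span == lv[i].span &&
		    (lv[i].flags & ~lv[kept - 1].flags) == 0)
			continue;	/* redundant: same CPUs, nothing new */
		lv[kept++] = lv[i];
	}
	return kept;
}
```

Because a folded level covers the same CPUs and adds no flag bits, every
domain that remains still has one mask and one set of flags.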
