All of lore.kernel.org
 help / color / mirror / Atom feed
From: dietmar.eggemann@arm.com (Dietmar Eggemann)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 5/5] sched: ARM: create a dedicated scheduler topology table
Date: Wed, 23 Apr 2014 12:46:10 +0100	[thread overview]
Message-ID: <5357A802.1030804@arm.com> (raw)
In-Reply-To: <1397209481-28542-6-git-send-email-vincent.guittot@linaro.org>

Hi,

I'm trying to use this approach of specifying different per-cpu views on
sd flags on DIE level on a TC2 platform (cluster 0 w/ CPU0/1 and cluster
1 w/ CPU2/3/4 w/o SMT). It doesn't work like in the case for the GMC/MC
sd level.

If I use the following patch (just to illustrate the issue) on top
(returning SD_SHARE_POWERDOMAIN in cpu_cpupower_flags() is bogus ... I
just needed a flag function for GDIE level):

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 71e1fec6d31a..803330d89c09 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -194,6 +194,14 @@ const struct cpumask *cpu_corepower_mask(int cpu)
 	return &cpu_topology[cpu].thread_sibling;
 }

+const struct cpumask *cpu_cpupower_mask(int cpu)
+{
+	return cpu_topology[cpu].socket_id ?
+			cpumask_of_node(cpu_to_node(cpu)) :
+			&cpu_topology[cpu].core_sibling;
+}
+
+
 static void update_siblings_masks(unsigned int cpuid)
 {
 	struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
@@ -280,11 +288,18 @@ static inline const int cpu_corepower_flags(void)
 	return SD_SHARE_PKG_RESOURCES  | SD_SHARE_POWERDOMAIN;
 }

+static inline const int cpu_cpupower_flags(void)
+{
+	return SD_SHARE_POWERDOMAIN;
+}
+
+
 static struct sched_domain_topology_level arm_topology[] = {
 #ifdef CONFIG_SCHED_MC
 	{ cpu_corepower_mask, cpu_corepower_flags, SD_INIT_NAME(GMC) },
 	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
 #endif
+	{ cpu_cpupower_mask, cpu_cpupower_flags, SD_INIT_NAME(GDIE) },
 	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
 	{ NULL, },
 };

so I get the following topology:

CPU0: cpu_corepower_mask=0   (GMC)
CPU0: cpu_coregroup_mask=0-1 (MC)
CPU0:  cpu_cpupower_mask=0-1 (GDIE)
CPU0:       cpu_cpu_mask=0-4 (DIE)
CPU1: cpu_corepower_mask=1    ...
CPU1: cpu_coregroup_mask=0-1
CPU1:  cpu_cpupower_mask=0-1
CPU1:       cpu_cpu_mask=0-4
CPU2: cpu_corepower_mask=2
CPU2: cpu_coregroup_mask=2-4
CPU2:  cpu_cpupower_mask=0-4
CPU2:       cpu_cpu_mask=0-4
CPU3: cpu_corepower_mask=3
CPU3: cpu_coregroup_mask=2-4
CPU3:  cpu_cpupower_mask=0-4
CPU3:       cpu_cpu_mask=0-4
CPU4: cpu_corepower_mask=4
CPU4: cpu_coregroup_mask=2-4
CPU4:  cpu_cpupower_mask=0-4
CPU4:       cpu_cpu_mask=0-4

Firstly, I had to get rid of the cpumask_equal(cpu_map,
sched_domain_span(sd)) condition in build_sched_domains() to allow that
I can have two sd levels which span CPU 0-4 (for CPU2/3/4).

But it still doesn't work correctly:

dmesg snippet 2:

CPU0 attaching sched-domain:
 domain 0: span 0-1 level MC
  groups: 0 1
  domain 1: span 0-4 level DIE     <-- error (there's only one group)
   groups: 0-4 (cpu_power = 2048)
...
CPU2 attaching sched-domain:
 domain 0: span 2-4 level MC
  groups: 2 3 4
  domain 1: span 0-4 level GDIE
ERROR: domain->groups does not contain CPU2
   groups:
ERROR: domain->cpu_power not set

ERROR: groups don't span domain->span
...

It turns out that the function get_group() which is used a couple of
times in build_sched_groups() uses a reference to sd->child and even
though the degenerate functionality gets rid of GDIE for CPU0/1 and DIE
for CPU2/3/4 the group set-up doesn't work as expected since sd->child
for DIE is GDIE and not MC any more.
So it looks like GMC/MC level is somehow special (GMC has no sd->child
for TC2 or GMC/MC contains only one cpu per group?).

Although this problem does not effect the current patch-set, people
might think that they can apply this degenerate trick for other sd
levels as well.

I'm trying to fix get_group()/build_sched_groups() in such a way that my
example would work but so far I haven't succeeded. The question for me
remains ... is this application of the degenerate trick feasible at all
in all sd levels, i.e. does it scale? What about platforms w/ SMT sd
level which want to use this trick in GMC/MC level?

Any hints are highly appreciated here.

-- Dietmar

On 11/04/14 10:44, Vincent Guittot wrote:
> Create a dedicated topology table for ARM which will create new level to
> differentiate CPUs that can or not powergate independantly from others.
> 
> The patch gives an example of how to add domain that will take advantage of
> SD_SHARE_POWERDOMAIN.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  arch/arm/kernel/topology.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index 0bc94b1..71e1fec 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -185,6 +185,15 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>  	return &cpu_topology[cpu].core_sibling;
>  }
>  
> +/*
> + * The current assumption is that we can power gate each core independently.
> + * This will be superseded by DT binding once available.
> + */
> +const struct cpumask *cpu_corepower_mask(int cpu)
> +{
> +	return &cpu_topology[cpu].thread_sibling;
> +}
> +
>  static void update_siblings_masks(unsigned int cpuid)
>  {
>  	struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> @@ -266,6 +275,20 @@ void store_cpu_topology(unsigned int cpuid)
>  		cpu_topology[cpuid].socket_id, mpidr);
>  }
>  
> +static inline const int cpu_corepower_flags(void)
> +{
> +	return SD_SHARE_PKG_RESOURCES  | SD_SHARE_POWERDOMAIN;
> +}
> +
> +static struct sched_domain_topology_level arm_topology[] = {
> +#ifdef CONFIG_SCHED_MC
> +	{ cpu_corepower_mask, cpu_corepower_flags, SD_INIT_NAME(GMC) },
> +	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
> +#endif
> +	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
> +	{ NULL, },
> +};
> +
>  /*
>   * init_cpu_topology is called at boot when only one cpu is running
>   * which prevent simultaneous write access to cpu_topology array
> @@ -289,4 +312,7 @@ void __init init_cpu_topology(void)
>  	smp_wmb();
>  
>  	parse_dt_topology();
> +
> +	/* Set scheduler topology descriptor */
> +	set_sched_topology(arm_topology);
>  }
> 

WARNING: multiple messages have this Message-ID (diff)
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Vincent Guittot <vincent.guittot@linaro.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"fenghua.yu@intel.com" <fenghua.yu@intel.com>,
	"schwidefsky@de.ibm.com" <schwidefsky@de.ibm.com>,
	"cmetcalf@tilera.com" <cmetcalf@tilera.com>,
	"benh@kernel.crashing.org" <benh@kernel.crashing.org>,
	"linux@arm.linux.org.uk" <linux@arm.linux.org.uk>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>
Cc: "preeti@linux.vnet.ibm.com" <preeti@linux.vnet.ibm.com>,
	"linaro-kernel@lists.linaro.org" <linaro-kernel@lists.linaro.org>
Subject: Re: [PATCH v4 5/5] sched: ARM: create a dedicated scheduler topology table
Date: Wed, 23 Apr 2014 12:46:10 +0100	[thread overview]
Message-ID: <5357A802.1030804@arm.com> (raw)
In-Reply-To: <1397209481-28542-6-git-send-email-vincent.guittot@linaro.org>

Hi,

I'm trying to use this approach of specifying different per-cpu views on
sd flags on DIE level on a TC2 platform (cluster 0 w/ CPU0/1 and cluster
1 w/ CPU2/3/4 w/o SMT). It doesn't work like in the case for the GMC/MC
sd level.

If I use the following patch (just to illustrate the issue) on top
(returning SD_SHARE_POWERDOMAIN in cpu_cpupower_flags() is bogus ... I
just needed a flag function for GDIE level):

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 71e1fec6d31a..803330d89c09 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -194,6 +194,14 @@ const struct cpumask *cpu_corepower_mask(int cpu)
 	return &cpu_topology[cpu].thread_sibling;
 }

+const struct cpumask *cpu_cpupower_mask(int cpu)
+{
+	return cpu_topology[cpu].socket_id ?
+			cpumask_of_node(cpu_to_node(cpu)) :
+			&cpu_topology[cpu].core_sibling;
+}
+
+
 static void update_siblings_masks(unsigned int cpuid)
 {
 	struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
@@ -280,11 +288,18 @@ static inline const int cpu_corepower_flags(void)
 	return SD_SHARE_PKG_RESOURCES  | SD_SHARE_POWERDOMAIN;
 }

+static inline const int cpu_cpupower_flags(void)
+{
+	return SD_SHARE_POWERDOMAIN;
+}
+
+
 static struct sched_domain_topology_level arm_topology[] = {
 #ifdef CONFIG_SCHED_MC
 	{ cpu_corepower_mask, cpu_corepower_flags, SD_INIT_NAME(GMC) },
 	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
 #endif
+	{ cpu_cpupower_mask, cpu_cpupower_flags, SD_INIT_NAME(GDIE) },
 	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
 	{ NULL, },
 };

so I get the following topology:

CPU0: cpu_corepower_mask=0   (GMC)
CPU0: cpu_coregroup_mask=0-1 (MC)
CPU0:  cpu_cpupower_mask=0-1 (GDIE)
CPU0:       cpu_cpu_mask=0-4 (DIE)
CPU1: cpu_corepower_mask=1    ...
CPU1: cpu_coregroup_mask=0-1
CPU1:  cpu_cpupower_mask=0-1
CPU1:       cpu_cpu_mask=0-4
CPU2: cpu_corepower_mask=2
CPU2: cpu_coregroup_mask=2-4
CPU2:  cpu_cpupower_mask=0-4
CPU2:       cpu_cpu_mask=0-4
CPU3: cpu_corepower_mask=3
CPU3: cpu_coregroup_mask=2-4
CPU3:  cpu_cpupower_mask=0-4
CPU3:       cpu_cpu_mask=0-4
CPU4: cpu_corepower_mask=4
CPU4: cpu_coregroup_mask=2-4
CPU4:  cpu_cpupower_mask=0-4
CPU4:       cpu_cpu_mask=0-4

Firstly, I had to get rid of the cpumask_equal(cpu_map,
sched_domain_span(sd)) condition in build_sched_domains() to allow that
I can have two sd levels which span CPU 0-4 (for CPU2/3/4).

But it still doesn't work correctly:

dmesg snippet 2:

CPU0 attaching sched-domain:
 domain 0: span 0-1 level MC
  groups: 0 1
  domain 1: span 0-4 level DIE     <-- error (there's only one group)
   groups: 0-4 (cpu_power = 2048)
...
CPU2 attaching sched-domain:
 domain 0: span 2-4 level MC
  groups: 2 3 4
  domain 1: span 0-4 level GDIE
ERROR: domain->groups does not contain CPU2
   groups:
ERROR: domain->cpu_power not set

ERROR: groups don't span domain->span
...

It turns out that the function get_group() which is used a couple of
times in build_sched_groups() uses a reference to sd->child and even
though the degenerate functionality gets rid of GDIE for CPU0/1 and DIE
for CPU2/3/4 the group set-up doesn't work as expected since sd->child
for DIE is GDIE and not MC any more.
So it looks like GMC/MC level is somehow special (GMC has no sd->child
for TC2 or GMC/MC contains only one cpu per group?).

Although this problem does not effect the current patch-set, people
might think that they can apply this degenerate trick for other sd
levels as well.

I'm trying to fix get_group()/build_sched_groups() in such a way that my
example would work but so far I haven't succeeded. The question for me
remains ... is this application of the degenerate trick feasible at all
in all sd levels, i.e. does it scale? What about platforms w/ SMT sd
level which want to use this trick in GMC/MC level?

Any hints are highly appreciated here.

-- Dietmar

On 11/04/14 10:44, Vincent Guittot wrote:
> Create a dedicated topology table for ARM which will create new level to
> differentiate CPUs that can or not powergate independantly from others.
> 
> The patch gives an example of how to add domain that will take advantage of
> SD_SHARE_POWERDOMAIN.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  arch/arm/kernel/topology.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index 0bc94b1..71e1fec 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -185,6 +185,15 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>  	return &cpu_topology[cpu].core_sibling;
>  }
>  
> +/*
> + * The current assumption is that we can power gate each core independently.
> + * This will be superseded by DT binding once available.
> + */
> +const struct cpumask *cpu_corepower_mask(int cpu)
> +{
> +	return &cpu_topology[cpu].thread_sibling;
> +}
> +
>  static void update_siblings_masks(unsigned int cpuid)
>  {
>  	struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> @@ -266,6 +275,20 @@ void store_cpu_topology(unsigned int cpuid)
>  		cpu_topology[cpuid].socket_id, mpidr);
>  }
>  
> +static inline const int cpu_corepower_flags(void)
> +{
> +	return SD_SHARE_PKG_RESOURCES  | SD_SHARE_POWERDOMAIN;
> +}
> +
> +static struct sched_domain_topology_level arm_topology[] = {
> +#ifdef CONFIG_SCHED_MC
> +	{ cpu_corepower_mask, cpu_corepower_flags, SD_INIT_NAME(GMC) },
> +	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
> +#endif
> +	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
> +	{ NULL, },
> +};
> +
>  /*
>   * init_cpu_topology is called at boot when only one cpu is running
>   * which prevent simultaneous write access to cpu_topology array
> @@ -289,4 +312,7 @@ void __init init_cpu_topology(void)
>  	smp_wmb();
>  
>  	parse_dt_topology();
> +
> +	/* Set scheduler topology descriptor */
> +	set_sched_topology(arm_topology);
>  }
> 



  reply	other threads:[~2014-04-23 11:46 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-11  9:44 [PATCH v4 0/5] rework sched_domain topology description Vincent Guittot
2014-04-11  9:44 ` Vincent Guittot
2014-04-11  9:44 ` [PATCH v4 1/5] sched: rework of sched_domain topology definition Vincent Guittot
2014-04-11  9:44   ` Vincent Guittot
2014-04-18 10:56   ` Peter Zijlstra
2014-04-18 10:56     ` Peter Zijlstra
2014-04-18 11:34     ` [PATCH] fix: " Vincent Guittot
2014-04-18 11:34       ` Vincent Guittot
2014-04-18 11:39       ` Peter Zijlstra
2014-04-18 11:39         ` Peter Zijlstra
2014-04-18 11:34     ` [PATCH v4 1/5] " Vincent Guittot
2014-04-18 11:34       ` Vincent Guittot
2014-05-08 10:43   ` [tip:sched/core] sched: Rework " tip-bot for Vincent Guittot
2014-05-16  9:57     ` James Hogan
2014-05-18  8:04       ` Vincent Guittot
2014-05-18  8:04         ` Vincent Guittot
2014-04-11  9:44 ` [PATCH v4 2/5] sched: s390: create a dedicated topology table Vincent Guittot
2014-04-11  9:44   ` Vincent Guittot
2014-05-08 10:43   ` [tip:sched/core] sched, s390: Create " tip-bot for Vincent Guittot
2014-04-11  9:44 ` [PATCH v4 3/5] sched: powerpc: create " Vincent Guittot
2014-04-11  9:44   ` Vincent Guittot
2014-05-08 10:43   ` [tip:sched/core] sched, powerpc: Create " tip-bot for Vincent Guittot
2014-04-11  9:44 ` [PATCH v4 4/5] sched: add a new SD_SHARE_POWERDOMAIN for sched_domain Vincent Guittot
2014-04-11  9:44   ` Vincent Guittot
2014-04-18 10:58   ` Peter Zijlstra
2014-04-18 10:58     ` Peter Zijlstra
2014-04-18 11:54     ` [PATCH] fix: sched: rework of sched_domain topology definition Vincent Guittot
2014-04-18 11:54       ` Vincent Guittot
2014-04-18 11:54     ` [PATCH v4 4/5] sched: add a new SD_SHARE_POWERDOMAIN for sched_domain Vincent Guittot
2014-04-18 11:54       ` Vincent Guittot
2014-05-08 10:44   ` [tip:sched/core] sched: Add " tip-bot for Vincent Guittot
2014-04-11  9:44 ` [PATCH v4 5/5] sched: ARM: create a dedicated scheduler topology table Vincent Guittot
2014-04-11  9:44   ` Vincent Guittot
2014-04-23 11:46   ` Dietmar Eggemann [this message]
2014-04-23 11:46     ` Dietmar Eggemann
2014-04-23 14:46     ` Vincent Guittot
2014-04-23 14:46       ` Vincent Guittot
2014-04-23 15:26       ` Dietmar Eggemann
2014-04-23 15:26         ` Dietmar Eggemann
2014-04-24  7:30         ` Vincent Guittot
2014-04-24  7:30           ` Vincent Guittot
2014-04-24 12:48           ` Dietmar Eggemann
2014-04-24 12:48             ` Dietmar Eggemann
2014-04-25  7:45             ` Vincent Guittot
2014-04-25  7:45               ` Vincent Guittot
2014-04-25 15:55               ` Dietmar Eggemann
2014-04-25 15:55                 ` Dietmar Eggemann
2014-04-25 16:04               ` Peter Zijlstra
2014-04-25 16:04                 ` Peter Zijlstra
2014-04-25 16:05                 ` Peter Zijlstra
2014-04-25 16:05                   ` Peter Zijlstra
2014-05-08 10:44   ` [tip:sched/core] sched, ARM: Create " tip-bot for Vincent Guittot
2014-04-12 12:56 ` [PATCH v4 0/5] rework sched_domain topology description Dietmar Eggemann
2014-04-12 12:56   ` Dietmar Eggemann
2014-04-14  7:29   ` Vincent Guittot
2014-04-14  7:29     ` Vincent Guittot
2014-04-15  7:53   ` Peter Zijlstra
2014-04-15  7:53     ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5357A802.1030804@arm.com \
    --to=dietmar.eggemann@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.