All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Morten Rasmussen <morten.rasmussen@arm.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Sai Gurrappadi <sgurrappadi@nvidia.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"yuyang.du@intel.com" <yuyang.du@intel.com>,
	"preeti@linux.vnet.ibm.com" <preeti@linux.vnet.ibm.com>,
	"mturquette@linaro.org" <mturquette@linaro.org>,
	"nico@linaro.org" <nico@linaro.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	Juri Lelli <Juri.Lelli@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Peter Boonstoppel <pboonstoppel@nvidia.com>
Subject: Re: [RFCv3 PATCH 30/48] sched: Calculate energy consumption of sched_group
Date: Thu, 26 Mar 2015 15:23:38 +0000	[thread overview]
Message-ID: <5514247A.1090009@arm.com> (raw)
In-Reply-To: <20150324173955.GI18994@e105550-lin.cambridge.arm.com>

On 24/03/15 17:39, Morten Rasmussen wrote:
> On Tue, Mar 24, 2015 at 04:10:37PM +0000, Peter Zijlstra wrote:
>> On Tue, Mar 24, 2015 at 10:44:24AM +0000, Morten Rasmussen wrote:
>>>>> Maybe remind us why this needs to be tied to sched_groups ? Why can't we
>>>>> attach the energy information to the domains?
>>
>>> In the current domain hierarchy you don't have domains with just one cpu
>>> in them. If you attach the per-cpu energy data to the MC level domain
>>> which spans the whole cluster, you break the current idea of attaching
>>> information to the cpumask (currently sched_group, but could be
>>> sched_domain as we discuss here) the information is associated with. You
>>> would have to either introduce a level of single cpu domains at the
>>> lowest level or move away from the idea of attaching data to the cpumask
>>> that is associated with it.
>>>
>>> Using sched_groups we do already have single cpu groups that we can
>>> attach per-cpu data to, but we are missing a top level group spanning
>>> the entire system for system wide energy data. So from that point of
>>> view groups and domains are equally bad.
>>
>> Oh urgh, good point that. Cursed if you do, cursed if you don't. Bugger.
> 
> Yeah :( I don't really care which one we choose. Adding another top
> level domain with one big group spanning all cpus, but with all SD flags
> disabled seems less intrusive than adding a level at the bottom.
> 
> Better ideas are very welcome.
> 

I had a stab at integrating such a top level (SYS) domain w/ all known SD
flags disabled. This SYS sd exposes itself w/ all counters set to 0 in
/proc/schedstat.

There're still some kludges in the patch blow:

- The need for a new topology SD flag to tell sd_init() that we want to
  reset the default sd configuration. 
- Don't break in build_sched_domains() at the first sd spanning cpu_map
- Don't decay newidle max times in rebalance_domains() by bailing early 
  on SYS sd.

It survived booting on single (MC-SYS) and dual cluster ARM (MC-DIE-SYS)
systems.
Would something like this be acceptable?

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f984b4e58865..8fbc9976f5d1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -904,6 +904,7 @@ enum cpu_idle_type {
 #define SD_BALANCE_FORK                0x0008  /* Balance on fork, clone */
 #define SD_BALANCE_WAKE                0x0010  /* Balance on wakeup */
 #define SD_WAKE_AFFINE         0x0020  /* Wake task to waking CPU */
+#define SD_SHARE_ENERGY                0x0040  /* System-wide energy data */
 #define SD_SHARE_CPUCAPACITY   0x0080  /* Domain members share cpu power */
 #define SD_SHARE_POWERDOMAIN   0x0100  /* Domain members share power domain */
 #define SD_SHARE_PKG_RESOURCES 0x0200  /* Domain members share cpu pkg resources */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4f52c2e7484e..d058dc1e639f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5529,7 +5529,7 @@ static int sd_degenerate(struct sched_domain *sd)
        }
 
        /* Following flags don't use groups */
-       if (sd->flags & (SD_WAKE_AFFINE))
+       if (sd->flags & (SD_WAKE_AFFINE | SD_SHARE_ENERGY))
                return 0;
 
        return 1;
@@ -6215,8 +6215,9 @@ static int sched_domains_curr_level;
  * SD_SHARE_POWERDOMAIN   - describes shared power domain
  * SD_SHARE_CAP_STATES    - describes shared capacity states
  *
- * Odd one out:
+ * Odd two out:
  * SD_ASYM_PACKING        - describes SMT quirks
+ * SD_SHARE_ENERGY        - describes EAS quirks
  */
 #define TOPOLOGY_SD_FLAGS              \
        (SD_SHARE_CPUCAPACITY |         \
@@ -6224,7 +6225,8 @@ static int sched_domains_curr_level;
         SD_NUMA |                      \
         SD_ASYM_PACKING |              \
         SD_SHARE_POWERDOMAIN |         \
-        SD_SHARE_CAP_STATES)
+        SD_SHARE_CAP_STATES |          \
+        SD_SHARE_ENERGY)
 
 static struct sched_domain *
 sd_init(struct sched_domain_topology_level *tl, int cpu)
@@ -6298,6 +6300,14 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
                sd->cache_nice_tries = 1;
                sd->busy_idx = 2;
 
+       } else if (sd->flags & SD_SHARE_ENERGY) {
+               /* Reset the default configuration completely */
+               memset(sd, 0, sizeof(*sd));
+
+               sd->flags = 1*SD_SHARE_ENERGY;
+#ifdef CONFIG_SCHED_DEBUG
+               sd->name = tl->name;
+#endif
 #ifdef CONFIG_NUMA
        } else if (sd->flags & SD_NUMA) {
                sd->cache_nice_tries = 2;
@@ -6826,8 +6836,6 @@ static int build_sched_domains(const struct cpumask *cpu_map,
                                *per_cpu_ptr(d.sd, i) = sd;
                        if (tl->flags & SDTL_OVERLAP || sched_feat(FORCE_SD_OVERLAP))
                                sd->flags |= SD_OVERLAP;
-                       if (cpumask_equal(cpu_map, sched_domain_span(sd)))
-                               break;
                }
        }
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cfe65aec3237..8d4cc72f4778 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8073,6 +8073,10 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
 
        rcu_read_lock();
        for_each_domain(cpu, sd) {
+
+               if (sd->flags & SD_SHARE_ENERGY)
+                       continue;
+
                /*
                 * Decay the newidle max times here because this is a regular
                 * visit to all the domains. Decay ~1% per second.


  reply	other threads:[~2015-03-26 15:23 UTC|newest]

Thread overview: 124+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-04 18:30 [RFCv3 PATCH 00/48] sched: Energy cost model for energy-aware scheduling Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 01/48] sched: add utilization_avg_contrib Morten Rasmussen
2015-02-11  8:50   ` Preeti U Murthy
2015-02-12  1:07     ` Vincent Guittot
2015-02-04 18:30 ` [RFCv3 PATCH 02/48] sched: Track group sched_entity usage contributions Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 03/48] sched: remove frequency scaling from cpu_capacity Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 04/48] sched: Make sched entity usage tracking frequency-invariant Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 05/48] sched: make scale_rt invariant with frequency Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 06/48] sched: add per rq cpu_capacity_orig Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 07/48] sched: get CPU's usage statistic Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 08/48] sched: replace capacity_factor by usage Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 09/48] sched: add SD_PREFER_SIBLING for SMT level Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 10/48] sched: move cfs task on a CPU with higher capacity Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 11/48] sched: Make load tracking frequency scale-invariant Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 12/48] sched: Make usage tracking cpu scale-invariant Morten Rasmussen
2015-03-23 14:46   ` Peter Zijlstra
2015-03-23 19:19     ` Dietmar Eggemann
     [not found]       ` <OF8A3E3617.0D4400A5-ON48257E3A.001B38D9-48257E3A.002379A4@zte.com.cn>
2015-05-06  9:49         ` Dietmar Eggemann
2015-02-04 18:30 ` [RFCv3 PATCH 13/48] cpufreq: Architecture specific callback for frequency changes Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 14/48] arm: Frequency invariant scheduler load-tracking support Morten Rasmussen
2015-03-23 13:39   ` Peter Zijlstra
2015-03-24  9:41     ` Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 15/48] arm: vexpress: Add CPU clock-frequencies to TC2 device-tree Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 16/48] arm: Cpu invariant scheduler load-tracking support Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 17/48] sched: Get rid of scaling usage by cpu_capacity_orig Morten Rasmussen
     [not found]   ` <OFFC493540.15A92099-ON48257E35.0026F60C-48257E35.0027A5FB@zte.com.cn>
2015-04-28 16:54     ` Dietmar Eggemann
2015-02-04 18:30 ` [RFCv3 PATCH 18/48] sched: Track blocked utilization contributions Morten Rasmussen
2015-03-23 14:08   ` Peter Zijlstra
2015-03-24  9:43     ` Morten Rasmussen
2015-03-24 16:07       ` Peter Zijlstra
2015-03-24 17:44         ` Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 19/48] sched: Include blocked utilization in usage tracking Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 20/48] sched: Documentation for scheduler energy cost model Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 21/48] sched: Make energy awareness a sched feature Morten Rasmussen
2015-02-04 18:30 ` [RFCv3 PATCH 22/48] sched: Introduce energy data structures Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 23/48] sched: Allocate and initialize " Morten Rasmussen
     [not found]   ` <OF29F384AC.37929D8E-ON48257E35.002FCB0C-48257E35.003156FE@zte.com.cn>
2015-04-29 15:43     ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 24/48] sched: Introduce SD_SHARE_CAP_STATES sched_domain flag Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 25/48] arm: topology: Define TC2 energy and provide it to the scheduler Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 26/48] sched: Compute cpu capacity available at current frequency Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 27/48] sched: Relocated get_cpu_usage() Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 28/48] sched: Use capacity_curr to cap utilization in get_cpu_usage() Morten Rasmussen
2015-03-23 16:14   ` Peter Zijlstra
2015-03-24 11:36     ` Morten Rasmussen
2015-03-24 12:59       ` Peter Zijlstra
2015-02-04 18:31 ` [RFCv3 PATCH 29/48] sched: Highest energy aware balancing sched_domain level pointer Morten Rasmussen
2015-03-23 16:16   ` Peter Zijlstra
2015-03-24 10:52     ` Morten Rasmussen
     [not found]   ` <OF5977496A.A21A7B96-ON48257E35.002EC23C-48257E35.00324DAD@zte.com.cn>
2015-04-29 15:54     ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 30/48] sched: Calculate energy consumption of sched_group Morten Rasmussen
2015-03-13 22:54   ` Sai Gurrappadi
2015-03-16 14:15     ` Morten Rasmussen
2015-03-23 16:47       ` Peter Zijlstra
2015-03-23 20:21         ` Dietmar Eggemann
2015-03-24 10:44           ` Morten Rasmussen
2015-03-24 16:10             ` Peter Zijlstra
2015-03-24 17:39               ` Morten Rasmussen
2015-03-26 15:23                 ` Dietmar Eggemann [this message]
2015-03-20 18:40   ` Sai Gurrappadi
2015-03-27 15:58     ` Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 31/48] sched: Extend sched_group_energy to test load-balancing decisions Morten Rasmussen
     [not found]   ` <OF081FBA75.F80B8844-ON48257E37.00261E89-48257E37.00267F24@zte.com.cn>
2015-04-30 20:26     ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 32/48] sched: Estimate energy impact of scheduling decisions Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 33/48] sched: Energy-aware wake-up task placement Morten Rasmussen
2015-03-13 22:47   ` Sai Gurrappadi
2015-03-16 14:47     ` Morten Rasmussen
2015-03-18 20:15       ` Sai Gurrappadi
2015-03-27 16:37         ` Morten Rasmussen
2015-03-24 13:00       ` Peter Zijlstra
2015-03-24 15:24         ` Morten Rasmussen
2015-03-24 13:00   ` Peter Zijlstra
2015-03-24 15:42     ` Morten Rasmussen
2015-03-24 15:53       ` Peter Zijlstra
2015-03-24 17:47         ` Morten Rasmussen
2015-03-24 16:35   ` Peter Zijlstra
2015-03-25 18:01     ` Juri Lelli
2015-03-25 18:14       ` Peter Zijlstra
2015-03-26 10:21         ` Juri Lelli
2015-03-26 10:41           ` Peter Zijlstra
2015-04-27 16:01             ` Michael Turquette
2015-04-28 13:06               ` Peter Zijlstra
2015-02-04 18:31 ` [RFCv3 PATCH 34/48] sched: Bias new task wakeups towards higher capacity cpus Morten Rasmussen
2015-03-24 13:33   ` Peter Zijlstra
2015-03-25 18:18     ` Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 35/48] sched, cpuidle: Track cpuidle state index in the scheduler Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 36/48] sched: Count number of shallower idle-states in struct sched_group_energy Morten Rasmussen
2015-03-24 13:14   ` Peter Zijlstra
2015-03-24 17:13     ` Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 37/48] sched: Determine the current sched_group idle-state Morten Rasmussen
     [not found]   ` <OF1FDC99CD.22435E74-ON48257E37.001BA739-48257E37.001CA5ED@zte.com.cn>
2015-04-30 20:17     ` Dietmar Eggemann
     [not found]       ` <OF2F4202E4.8A4AF229-ON48257E38.00312CD4-48257E38.0036ADB6@zte.com.cn>
2015-05-01 15:09         ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 38/48] sched: Infrastructure to query if load balancing is energy-aware Morten Rasmussen
2015-03-24 13:41   ` Peter Zijlstra
2015-03-24 16:17     ` Dietmar Eggemann
2015-03-24 13:56   ` Peter Zijlstra
2015-03-24 16:22     ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 39/48] sched: Introduce energy awareness into update_sg_lb_stats Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 40/48] sched: Introduce energy awareness into update_sd_lb_stats Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 41/48] sched: Introduce energy awareness into find_busiest_group Morten Rasmussen
2015-02-04 18:31 ` [RFCv3 PATCH 42/48] sched: Introduce energy awareness into find_busiest_queue Morten Rasmussen
2015-03-24 15:21   ` Peter Zijlstra
2015-03-24 18:04     ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 43/48] sched: Introduce energy awareness into detach_tasks Morten Rasmussen
2015-03-24 15:25   ` Peter Zijlstra
2015-03-25 23:50   ` Sai Gurrappadi
2015-03-27 15:03     ` Dietmar Eggemann
     [not found]       ` <OFDCE15EEF.2F536D7F-ON48257E37.002565ED-48257E37.0027A8B9@zte.com.cn>
2015-04-30 20:35         ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 44/48] sched: Tipping point from energy-aware to conventional load balancing Morten Rasmussen
2015-03-24 15:26   ` Peter Zijlstra
2015-03-24 18:47     ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 45/48] sched: Skip cpu as lb src which has one task and capacity gte the dst cpu Morten Rasmussen
2015-03-24 15:27   ` Peter Zijlstra
2015-03-25 18:44     ` Dietmar Eggemann
     [not found]       ` <OF9320540C.255228F9-ON48257E37.002A02D1-48257E37.002AB5EE@zte.com.cn>
2015-05-05 10:01         ` Dietmar Eggemann
2015-02-04 18:31 ` [RFCv3 PATCH 46/48] sched: Turn off fast idling of cpus on a partially loaded system Morten Rasmussen
2015-03-24 16:01   ` Peter Zijlstra
2015-02-04 18:31 ` [RFCv3 PATCH 47/48] sched: Enable active migration for cpus of lower capacity Morten Rasmussen
2015-03-24 16:02   ` Peter Zijlstra
2015-02-04 18:31 ` [RFCv3 PATCH 48/48] sched: Disable energy-unfriendly nohz kicks Morten Rasmussen
2015-02-20 19:26   ` Dietmar Eggemann
2015-04-02 12:43 ` [RFCv3 PATCH 00/48] sched: Energy cost model for energy-aware scheduling Vincent Guittot
2015-04-08 13:33   ` Morten Rasmussen
2015-04-09  7:41     ` Vincent Guittot
2015-04-10 14:46       ` Morten Rasmussen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5514247A.1090009@arm.com \
    --to=dietmar.eggemann@arm.com \
    --cc=Juri.Lelli@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=mturquette@linaro.org \
    --cc=nico@linaro.org \
    --cc=pboonstoppel@nvidia.com \
    --cc=peterz@infradead.org \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=rjw@rjwysocki.net \
    --cc=sgurrappadi@nvidia.com \
    --cc=vincent.guittot@linaro.org \
    --cc=yuyang.du@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.