public inbox for linux-kernel@vger.kernel.org
From: Juri Lelli <juri.lelli@redhat.com>
To: Quentin Perret <quentin.perret@arm.com>
Cc: peterz@infradead.org, rjw@rjwysocki.net,
	gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org,
	linux-pm@vger.kernel.org, mingo@redhat.com,
	dietmar.eggemann@arm.com, morten.rasmussen@arm.com,
	chris.redpath@arm.com, patrick.bellasi@arm.com,
	valentin.schneider@arm.com, vincent.guittot@linaro.org,
	thara.gopinath@linaro.org, viresh.kumar@linaro.org,
	tkjos@google.com, joelaf@google.com, smuckle@google.com,
	adharmap@quicinc.com, skannan@quicinc.com,
	pkondeti@codeaurora.org, edubezval@gmail.com,
	srinivas.pandruvada@linux.intel.com, currojerez@riseup.net,
	javi.merino@kernel.org
Subject: Re: [RFC PATCH v3 05/10] sched/topology: Reference the Energy Model of CPUs when available
Date: Thu, 7 Jun 2018 16:44:22 +0200
Message-ID: <20180607144422.GA17216@localhost.localdomain>
In-Reply-To: <20180521142505.6522-6-quentin.perret@arm.com>

Hi,

On 21/05/18 15:25, Quentin Perret wrote:
> In order to use EAS, the task scheduler has to know about the Energy
> Model (EM) of the platform. This commit extends the scheduler topology
> code to take references on the frequency domains objects of the EM
> framework for all online CPUs. Hence, the availability of the EM for
> those CPUs is guaranteed to the scheduler at runtime without further
> checks in latency sensitive code paths (i.e. task wake-up).
> 
> A (RCU-protected) private list of online frequency domains is maintained
> by the scheduler to enable fast iterations. Furthermore, the availability
> of an EM is notified to the rest of the scheduler with a static key,
> which ensures a low impact on non-EAS systems.
> 
> Energy Aware Scheduling can be started if and only if:
>    1. all online CPUs are covered by the EM;
>    2. the EM complexity is low enough to keep scheduling overheads low;
>    3. the platform has an asymmetric CPU capacity topology (detected by
>       looking for the SD_ASYM_CPUCAPACITY flag in the sched_domain
>       hierarchy).

Not sure about this. What about systems with multiple frequency domains
but the same max capacity? I understand that most of the energy savings
come from selecting the right (big/LITTLE) cluster, but the EM should
still be useful to drive OPP selection (that was one of the use-cases we
discussed lately, IIRC) and also to decide between packing and
spreading, no?

> The sched_energy_enabled() function which returns the status of the
> static key is stubbed to false when CONFIG_ENERGY_MODEL=n, hence making
> sure that all the code behind it can be compiled out by constant
> propagation.

Actually, do we need a config option at all? Shouldn't the static key
(and RCU machinery) guard against unwanted overheads when EM is not
present/used?

I was thinking it should be pretty similar to the schedutil setup, no?
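
To make the question concrete, here is a minimal userspace sketch (not
kernel code) of the two flavours of sched_energy_enabled() in the quoted
hunk below. SKETCH_ENERGY_MODEL, sched_energy_present_flag and
select_cpu() are made-up names for illustration only: a plain bool
stands in for the static key, and the macro stands in for
CONFIG_ENERGY_MODEL. With the macro undefined the helper is a constant
false, so the compiler can drop the guarded branch entirely; with it
defined, only the runtime check remains.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel static key: a plain flag. */
static bool sched_energy_present_flag;

#ifdef SKETCH_ENERGY_MODEL	/* stands in for CONFIG_ENERGY_MODEL */
static inline bool sched_energy_enabled(void)
{
	/* Runtime check; the kernel would use a static branch here. */
	return sched_energy_present_flag;
}
#else
static inline bool sched_energy_enabled(void)
{
	/* Constant false: code guarded by this helper can be compiled out. */
	return false;
}
#endif

/* Hypothetical wake-up path: consult the EM only when EAS is enabled. */
static int select_cpu(int prev_cpu, int energy_efficient_cpu)
{
	if (sched_energy_enabled())
		return energy_efficient_cpu;
	return prev_cpu;
}
```

Built without SKETCH_ENERGY_MODEL, select_cpu() always returns prev_cpu
whatever the flag says, which is the constant-propagation effect the
changelog describes.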

> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> ---
>  kernel/sched/sched.h    |  27 ++++++++++
>  kernel/sched/topology.c | 113 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 140 insertions(+)
> 
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index ce562d3b7526..7c517076a74a 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -63,6 +63,7 @@
>  #include <linux/syscalls.h>
>  #include <linux/task_work.h>
>  #include <linux/tsacct_kern.h>
> +#include <linux/energy_model.h>
>  
>  #include <asm/tlb.h>
>  
> @@ -2162,3 +2163,29 @@ static inline unsigned long cpu_util_cfs(struct rq *rq)
>  	return util;
>  }
>  #endif
> +
> +struct sched_energy_fd {
> +	struct em_freq_domain *fd;
> +	struct list_head next;
> +	struct rcu_head rcu;
> +};
> +
> +#ifdef CONFIG_ENERGY_MODEL
> +extern struct static_key_false sched_energy_present;
> +static inline bool sched_energy_enabled(void)
> +{
> +	return static_branch_unlikely(&sched_energy_present);
> +}
> +
> +extern struct list_head sched_energy_fd_list;
> +#define for_each_freq_domain(sfd) \
> +		list_for_each_entry_rcu(sfd, &sched_energy_fd_list, next)
> +#define freq_domain_span(sfd) (&((sfd)->fd->cpus))
> +#else
> +static inline bool sched_energy_enabled(void)
> +{
> +	return false;
> +}
> +#define for_each_freq_domain(sfd) for (sfd = NULL; sfd;)
> +#define freq_domain_span(sfd) NULL
> +#endif
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 64cc564f5255..3e22c798f18d 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1500,6 +1500,116 @@ void sched_domains_numa_masks_clear(unsigned int cpu)
>  
>  #endif /* CONFIG_NUMA */
>  
> +#ifdef CONFIG_ENERGY_MODEL
> +
> +/*
> + * The complexity of the Energy Model is defined as the product of the number
> + * of frequency domains with the sum of the number of CPUs and the total
> + * number of OPPs in all frequency domains. It is generally not a good idea
> + * to use such a model on very complex platform because of the associated
> + * scheduling overheads. The arbitrary constraint below prevents that. It
> + * makes EAS usable up to 16 CPUs with per-CPU DVFS and less than 8 OPPs each,
> + * for example.
> + */
> +#define EM_MAX_COMPLEXITY 2048

Do we really need this hardcoded constant?

I guess if one spent time deriving an EM for a big system with a lot of
OPPs, she/he already knows what she/he is doing? :)
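
For reference, a userspace sketch of what the limit allows, following
the definition in the comment quoted above: complexity is the number of
frequency domains times the sum of the number of CPUs and the total
number of OPPs across all domains. The helper names em_complexity() and
em_too_complex() are made up for this illustration and are not part of
the patch.

```c
#include <stdbool.h>

#define EM_MAX_COMPLEXITY 2048

/*
 * nr_fd:   number of frequency domains
 * nr_opp:  total number of OPPs, summed over all frequency domains
 * nr_cpus: number of online CPUs
 */
static unsigned long em_complexity(unsigned int nr_fd, unsigned int nr_opp,
				   unsigned int nr_cpus)
{
	return (unsigned long)nr_fd * (nr_opp + nr_cpus);
}

/* True when the model exceeds the limit and EAS would not be started. */
static bool em_too_complex(unsigned int nr_fd, unsigned int nr_opp,
			   unsigned int nr_cpus)
{
	return em_complexity(nr_fd, nr_opp, nr_cpus) > EM_MAX_COMPLEXITY;
}
```

With 16 CPUs and per-CPU DVFS, 7 OPPs each gives 16 * (112 + 16) = 2048,
which still passes, while 8 OPPs each gives 16 * (128 + 16) = 2304,
which does not. A two-cluster system with 8 CPUs and, say, 10 OPPs per
cluster is at 2 * (20 + 8) = 56.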

> +
> +DEFINE_STATIC_KEY_FALSE(sched_energy_present);
> +LIST_HEAD(sched_energy_fd_list);
> +
> +static struct sched_energy_fd *find_sched_energy_fd(int cpu)
> +{
> +	struct sched_energy_fd *sfd;
> +
> +	for_each_freq_domain(sfd) {
> +		if (cpumask_test_cpu(cpu, freq_domain_span(sfd)))
> +			return sfd;
> +	}
> +
> +	return NULL;
> +}
> +
> +static void free_sched_energy_fd(struct rcu_head *rp)
> +{
> +	struct sched_energy_fd *sfd;
> +
> +	sfd = container_of(rp, struct sched_energy_fd, rcu);
> +	kfree(sfd);
> +}
> +
> +static void build_sched_energy(void)
> +{
> +	struct sched_energy_fd *sfd, *tmp;
> +	struct em_freq_domain *fd;
> +	struct sched_domain *sd;
> +	int cpu, nr_fd = 0, nr_opp = 0;
> +
> +	rcu_read_lock();
> +
> +	/* Disable EAS entirely whenever the system isn't asymmetric. */
> +	cpu = cpumask_first(cpu_online_mask);
> +	sd = lowest_flag_domain(cpu, SD_ASYM_CPUCAPACITY);
> +	if (!sd) {
> +		pr_debug("%s: no SD_ASYM_CPUCAPACITY\n", __func__);
> +		goto disable;
> +	}
> +
> +	/* Make sure to have an energy model for all CPUs. */
> +	for_each_online_cpu(cpu) {
> +		/* Skip CPUs with a known energy model. */
> +		sfd = find_sched_energy_fd(cpu);
> +		if (sfd)
> +			continue;
> +
> +		/* Add the energy model of others. */
> +		fd = em_cpu_get(cpu);
> +		if (!fd)
> +			goto disable;
> +		sfd = kzalloc(sizeof(*sfd), GFP_NOWAIT);
> +		if (!sfd)
> +			goto disable;
> +		sfd->fd = fd;
> +		list_add_rcu(&sfd->next, &sched_energy_fd_list);
> +	}
> +
> +	list_for_each_entry_safe(sfd, tmp, &sched_energy_fd_list, next) {
> +		if (cpumask_intersects(freq_domain_span(sfd),
> +							cpu_online_mask)) {
> +			nr_opp += em_fd_nr_cap_states(sfd->fd);
> +			nr_fd++;
> +			continue;
> +		}
> +
> +		/* Remove the unused frequency domains */
> +		list_del_rcu(&sfd->next);
> +		call_rcu(&sfd->rcu, free_sched_energy_fd);

Unused because of what? Hotplug?

Not sure, but I guess you have considered the idea of tearing all this
down when sched domains are destroyed and then rebuilding it again? Why
did you decide on this approach? Or maybe I just missed where you do
that. :/

> +	}
> +
> +	/* Bail out if the Energy Model complexity is too high. */
> +	if (nr_fd * (nr_opp + num_online_cpus()) > EM_MAX_COMPLEXITY) {
> +		pr_warn("%s: EM complexity too high, stopping EAS", __func__);
> +		goto disable;
> +	}
> +
> +	rcu_read_unlock();
> +	static_branch_enable_cpuslocked(&sched_energy_present);
> +	pr_debug("%s: EAS started\n", __func__);

I'd vote for a pr_info here instead, maybe printing info about the EM as
well. It looks pretty useful to me to have that in dmesg. Maybe guarded
by sched_debug?

Best,

- Juri
