From mboxrd@z Thu Jan 1 00:00:00 1970
From: mtitinger@baylibre.com (Marc Titinger)
Date: Thu, 19 Nov 2015 09:50:30 +0100
Subject: [PATCH RFC 18/27] drivers: cpu-pd: Add PM Domain governor for CPUs
In-Reply-To: <20151118184245.GA28755@red-moon>
References: <1447799871-56374-1-git-send-email-lina.iyer@linaro.org>
 <1447799871-56374-19-git-send-email-lina.iyer@linaro.org>
 <20151118184245.GA28755@red-moon>
Message-ID: <564D8D56.1080206@baylibre.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On 18/11/2015 19:42, Lorenzo Pieralisi wrote:
> On Tue, Nov 17, 2015 at 03:37:42PM -0700, Lina Iyer wrote:
>> A PM domain comprising CPUs may be powered off when all the CPUs in
>> the domain are powered down. Powering down a CPU domain is generally an
>> expensive operation, so the power/performance trade-offs should be
>> considered. The time between the last CPU powering down and the first
>> CPU powering up in a domain is the time available for the domain to
>> sleep. Ideally, the sleep time of the domain should fulfill the
>> residency requirement of the domain's idle state.
>>
>> To do this effectively, read the time before the wakeup of the
>> cluster's CPUs and ensure that the domain's idle state sleep time
>> guarantees the QoS requirements of each of the CPUs, the PM QoS
>> CPU_DMA_LATENCY and the state's residency.
>
> To me this information should be part of the CPUidle governor (it is
> already there); we should not split the decision into multiple layers.
>
> The problem you are facing is that the CPUidle governor(s) do not take
> cross-CPU relationships into account. I do not think that adding another
> decision layer in the power domain subsystem helps; you are doing that
> just because adding it to the existing CPUidle governor(s) is invasive.
>
> Why can't we use the power domain work you put together to e.g. disable
> idle states that are shared between multiple cpus and make them "visible"
> only when the power domain that encompasses them is actually going down?
>
> You could use the power domains information to detect states that
> are shared between cpus.
>
> It is just an idea. What I am saying is that having another governor in
> the power domain subsystem does not make much sense: you split the
> decision into two layers while there is actually one, the existing
> CPUidle governor, and that's where the decision should be taken.
>
> Thoughts appreciated.

Maybe this is silly and not thought through, but I wonder if the
responsibilities could be split, for instance with an outer control loop
that uses a heuristic to compute the next tick time and the CPU power
required during that time slot, and an inner control loop (genpd) that
has a per-domain QoS and can optimize power consumption.

Marc.

>
> Lorenzo
>
>> Signed-off-by: Lina Iyer
>> ---
>>  drivers/base/power/cpu-pd.c | 83 ++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 82 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/power/cpu-pd.c b/drivers/base/power/cpu-pd.c
>> index 617ce54..a00abc1 100644
>> --- a/drivers/base/power/cpu-pd.c
>> +++ b/drivers/base/power/cpu-pd.c
>> @@ -21,6 +21,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  #define CPU_PD_NAME_MAX 36
>>
>> @@ -66,6 +67,86 @@ static void get_cpus_in_domain(struct generic_pm_domain *genpd,
>>  	}
>>  }
>>
>> +static bool cpu_pd_down_ok(struct dev_pm_domain *pd)
>> +{
>> +	struct generic_pm_domain *genpd = pd_to_genpd(pd);
>> +	struct cpu_pm_domain *cpu_pd = to_cpu_pd(genpd);
>> +	int qos = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
>> +	s64 sleep_ns;
>> +	ktime_t earliest;
>> +	int cpu;
>> +	int i;
>> +
>> +	/* Reset the last set genpd state, default to index 0 */
>> +	genpd->state_idx = 0;
>> +
>> +	/* We don't want to power down if QoS is 0 */
>> +	if (!qos)
>> +		return false;
>> +
>> +	/*
>> +	 * Find the sleep time for the cluster.
>> +	 * The time between now and the first wakeup of any CPU in
>> +	 * this domain hierarchy is the time available for the
>> +	 * domain to be idle.
>> +	 */
>> +	earliest.tv64 = KTIME_MAX;
>> +	for_each_cpu_and(cpu, cpu_pd->cpus, cpu_online_mask) {
>> +		struct device *cpu_dev = get_cpu_device(cpu);
>> +		struct gpd_timing_data *td;
>> +
>> +		td = &dev_gpd_data(cpu_dev)->td;
>> +
>> +		if (td->next_wakeup.tv64 < earliest.tv64)
>> +			earliest = td->next_wakeup;
>> +	}
>> +
>> +	sleep_ns = ktime_to_ns(ktime_sub(earliest, ktime_get()));
>> +	if (sleep_ns <= 0)
>> +		return false;
>> +
>> +	/*
>> +	 * Find the deepest sleep state that satisfies the residency
>> +	 * requirement and the QoS constraint.
>> +	 */
>> +	for (i = genpd->state_count - 1; i > 0; i--) {
>> +		u64 state_sleep_ns;
>> +
>> +		state_sleep_ns = genpd->states[i].power_off_latency_ns +
>> +			genpd->states[i].power_on_latency_ns +
>> +			genpd->states[i].residency_ns;
>> +
>> +		/*
>> +		 * If we can't sleep to save power in the state, move on
>> +		 * to the next lower idle state.
>> +		 */
>> +		if (state_sleep_ns > sleep_ns)
>> +			continue;
>> +
>> +		/*
>> +		 * We also don't want to sleep more than we should to
>> +		 * guarantee QoS.
>> +		 */
>> +		if (state_sleep_ns < (qos * NSEC_PER_USEC))
>> +			break;
>> +	}
>> +
>> +	if (i >= 0)
>> +		genpd->state_idx = i;
>> +
>> +	return (i >= 0) ? true : false;
>> +}
>> +
>> +static bool cpu_stop_ok(struct device *dev)
>> +{
>> +	return true;
>> +}
>> +
>> +struct dev_power_governor cpu_pd_gov = {
>> +	.power_down_ok = cpu_pd_down_ok,
>> +	.stop_ok = cpu_stop_ok,
>> +};
>> +
>>  static int cpu_pd_power_off(struct generic_pm_domain *genpd)
>>  {
>>  	struct cpu_pm_domain *pd = to_cpu_pd(genpd);
>> @@ -183,7 +264,7 @@ int of_register_cpu_pm_domain(struct device_node *dn,
>>
>>  	/* Register the CPU genpd */
>>  	pr_debug("adding %s as CPU PM domain.\n", pd->genpd->name);
>> -	ret = of_pm_genpd_init(dn, pd->genpd, &simple_qos_governor, false);
>> +	ret = of_pm_genpd_init(dn, pd->genpd, &cpu_pd_gov, false);
>>  	if (ret) {
>>  		pr_err("Unable to initialize domain %s\n", dn->full_name);
>>  		return ret;
>> --
>> 2.1.4
>>