From mboxrd@z Thu Jan 1 00:00:00 1970 From: Brendan Jackman Subject: [draft] Re: [PATCH 02/14] dt/bindings: update binding for PM domain idle states Date: Wed, 27 Jul 2016 12:11:24 +0100 Message-ID: <579896DC.1040607@arm.com> References: <1466624209-27432-1-git-send-email-lina.iyer@linaro.org> <1466624209-27432-3-git-send-email-lina.iyer@linaro.org> <20160623173312.GA22204@leverpostej> <20160623180451.GD1115@linaro.org> <20160623181927.GB31170@leverpostej> <20160623183939.GE1115@linaro.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------030900010906090709090600" Return-path: Received: from eu-smtp-delivery-143.mimecast.com ([146.101.78.143]:60457 "EHLO eu-smtp-delivery-143.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753432AbcG0LLp (ORCPT ); Wed, 27 Jul 2016 07:11:45 -0400 In-Reply-To: <20160623183939.GE1115@linaro.org> Sender: linux-arm-msm-owner@vger.kernel.org List-Id: linux-arm-msm@vger.kernel.org To: Sudeep Holla Cc: SSG-SW-Arch-Dev-Power@arm.com, k.kozlowski@samsung.com, andy.gross@linaro.org, sboyd@codeaurora.org, linux-arm-msm@vger.kernel.org, lorenzo.pieralisi@arm.com --------------030900010906090709090600 Content-Type: text/plain; charset=WINDOWS-1252; format=flowed Content-Transfer-Encoding: quoted-printable Hi folks, I'm trying to start a conversation with Lina about using the power-domains DT bindings for describing the EAS power model. Partly because I'm still waiting for fossmail approval so I can't send this email to anyone without an @arm.com address, I wanted to circulate a draft internally before I make a fool of myself pubicly; does this mail make sense to you guys (also please let me know if I'm egregiously breaking linux-pm etiquette)? Cheers, Brendan =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D Hi Lina, Mark, I'm currently investigating a generic way of expressing in DT the power costs of CPU idle states, for use by the energy model for Energy Aware Scheduling (EAS) [1][2]. Essentially what I want to do is probably add a "power-usage" property to idle state nodes. I have heard that you and/or Kevin Hilman have actually experimented with something similar. However, the model is required to be topologically aware: for each idle state, we want to know the power usage of the CPU, as well as of the cluster-level resources it shares. This has led me to this patchset, since the CPU power domain bindings allow you to tell which topology level "owns" an idle state. I guess my point here is that these bindings have another use-case besides OS-coordinated cluster idle. I've hacked up a patch (attached) for my experimentation which allows the bindings as described in this mail to be used without any of the actual CPU Runtime PM implementation I share Mark's concern about the duplication between cpu-idle-states in CPU nodes and power-states in power domain nodes. As I understand it, with this patchset the boundary between cpu-idle-states and power-states nodes translates to the boundary between states managed by cpuidle and states managed by Runtime PM. Mark hinted at a solution where idle state nodes refer to the power domains they affect. I guess this solution would look something like: CPU_0: cpu@0 { compatible =3D "example-cpu"; reg =3D <0x0 0x0>; device_type =3D "cpu"; enable-method =3D "psci"; cpu-idle-states =3D <&CPU_SLEEP &CLUSTER_SLEEP>; power-domains =3D <&CLUSTER_PD>; }; /* ... elsewhere ... */ idle-states { CPU_SLEEP: cpu-sleep { compatible =3D "arm,idle-state"; arm,psci-suspend-param =3D <0x0010000>; entry-latency-us =3D <100>; exit-latency-us =3D <250>; min-residency-us =3D <150>; }; CLUSTER_SLEEP: cluster-sleep { compatible =3D "arm,idle-state"; arm,psci-suspend-param =3D <0x1010000>; entry-latency-us =3D <800>; exit-latency-us =3D <700>; min-residency-us =3D <2500>; affects-power-domains =3D <&CLUSTER_PD> }; }; Then if desired/necessary, CLUSTER_SLEEP can be taken care of by Runtime PM (i.e. become a genpd_power_state) rather than cpuidle, the deciding factor being that it has an affects-power-domain property. (In this model, idle state nodes don't have a power-states property at all.) Mark, have I understood you correctly, is that what you were imagining? On 23/06/16 19:39, Lina Iyer wrote: > On Thu, Jun 23 2016 at 12:19 -0600, Mark Rutland wrote: >> On Thu, Jun 23, 2016 at 12:04:51PM -0600, Lina Iyer wrote: >>> Hi Mark, >>> >>> On Thu, Jun 23 2016 at 11:35 -0600, Mark Rutland wrote: >>> >Hi, >>> > >>> >On Wed, Jun 22, 2016 at 01:36:37PM -0600, Lina Iyer wrote: >>> >>From: Axel Haslam >>> >> >>> >>Update DT bindings to describe idle states of PM domains. >>> >> >>> >>Cc: >>> >>Signed-off-by: Marc Titinger >>> >>Signed-off-by: Lina Iyer >>> >>[Lina: Added state properties, removed state names, wakeup-latency, >>> >>added of_pm_genpd_init() API, pruned commit text] >>> >>Signed-off-by: Ulf Hansson >>> >>[Ulf: Moved around code to make it compile properly, rebased on top >>> of multiple state support] >>> >>--- >>> >> .../devicetree/bindings/power/power_domain.txt | 70 >>> ++++++++++++++++++++++ >>> >> 1 file changed, 70 insertions(+) >>> >> >>> >>diff --git >>> a/Documentation/devicetree/bindings/power/power_domain.txt >>> b/Documentation/devicetree/bindings/power/power_domain.txt >>> >>index 025b5e7..41e8dda 100644 >>> >>--- a/Documentation/devicetree/bindings/power/power_domain.txt >>> >>+++ b/Documentation/devicetree/bindings/power/power_domain.txt >>> >>@@ -29,6 +29,43 @@ Optional properties: >>> >> specified by this binding. More details about power domain >>> specifier are >>> >> available in the next section. >>> >> >>> >>+- power-states : A phandle of an idle-state that shall be soaked >>> into a >>> >>+ generic domain power state. >>> > >>> >It's somewhat unfortunate that this gives us two possible locations fo= r >>> >idle state lists (under the /cpus node and in a pm-domains node), >>> >especially as it's not clear what would happen were a DT to have both. >>> > >>> >I would prefer that we extend the existing bindings such that states >>> can >>> >refer to the power domains which they affect. >>> > >>> I agree. The CPU idle states have become defined to be specific to CPUs= . >>> PM Domain idle states are generic for any type of domain. I am hoping a= t >>> some point, we could converge and use the same idle state, but that >>> would mean changing the CPU idle states to make it generic. >> >> Outside of CPU idling, I don't fully understand how this will be used, >> so it's not clear to me what would need to be made generic. Apologies >> for my ignorance there. >> > There may be non-PSCI ARM v7 CPU domains that may have domain controller > drivers in the kernel. They would not hook into the ARM PSCI frameworks. > It is still cpuidle though. > Apologies also for my ignorance, but I don't think I understand you here. It sounds to me like the drivers you're describing are currently unable to get their idle states from DT - is that correct? If so, perhaps we can implement support using the CPU idle state nodes at first, then later make them generic so that these non-PSCI domains can use DT. >>> At some point, during my development, I did use the arm,idle-state for >>> domains as well, but the binding definitions were too restrictive for >>> a generic PM domain. >>> >>> I would be willing to make the change to CPU idle states to make it >>> generic and then we could just reference domain and CPU idle states >>> using the same bindings. Are we okay with that, specifically, >>> arm,psci-suspend-param? This binding is very restrictive in its >>> description. What we pass to the platform driver upon choosing a domain >>> state is very platform specific and therefore has to be generic in its >>> description. >> >> I was suggesting that for PSCI we should consistently us >> arm,psci-suspend-param, not that this should be used for all power >> domain state data. >> >> I imagine that mechanisms for powering down power domains will have >> varied requirements on data they require (and may require more than can >> be encoded in a u32), and I don't think it's best to try to force a >> single representation in the DT for that. It would be better to allow >> them to define the properties which they require. >> > The only way to do that is to push the DT parsing to the platform > drivers. In the case of CPU domains controlled by PSCI, we could use the > arm,idle-states but any other generic domain, may need to define their > own bindings and fill up the domain states before initiailizing the domai= n. > > While this approach pushes the onus on to the platform code, I am fine > with it. Is that what you were thinking too? [1] http://www.linaro.org/blog/core-dump/energy-aware-scheduling-eas-progress-u= pdate/ [2] https://lkml.org/lkml/2015/7/7/754 IMPORTANT NOTICE: The contents of this email and any attachments are confid= ential and may also be privileged. If you are not the intended recipient, p= lease notify the sender immediately and do not disclose the contents to any= other person, use it for any purpose, or store or copy the information in = any medium. Thank you. --------------030900010906090709090600 Content-Type: text/x-patch; name=0001-cpuidle-dt-Parse-states-from-power-domain-nodes.patch Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="0001-cpuidle-dt-Parse-states-from-power-domain-nodes.patch" >>From 598a7acb19af3763aa17926168e24d8131f3a57d Mon Sep 17 00:00:00 2001 From: Brendan Jackman Date: Mon, 25 Jul 2016 14:26:08 +0100 Subject: [PATCH] cpuidle: dt: Parse states from power domain nodes This commit allows bindings similar to those in [1] to be used without requiring CPU PM domains. The cpuidle DT parsing code will first read idle states from the "cpu-idle-states" property, then walk up the power-domains tree (assuming that it is indeed a tree) for the CPU and read idle states from the "power-states" property. This is a temporary hack to enable a prototype EAS energy model in the device tree, for which we want to use CPU power-domains to describe energy costs in a topologically-aware way. [1] https://patchwork.kernel.org/patch/9193651/ --- drivers/cpuidle/dt_idle_states.c | 85 ++++++++++++++++++++++++++++++++++++= ++-- 1 file changed, 82 insertions(+), 3 deletions(-) diff --git a/drivers/cpuidle/dt_idle_states.c b/drivers/cpuidle/dt_idle_sta= tes.c index a5c111b..70fe30b 100644 --- a/drivers/cpuidle/dt_idle_states.c +++ b/drivers/cpuidle/dt_idle_states.c @@ -21,6 +21,8 @@ =20 #include "dt_idle_states.h" =20 +#define MAX_PD_DEPTH 8 + static int init_state_node(struct cpuidle_state *idle_state, =09=09=09 const struct of_device_id *matches, =09=09=09 struct device_node *state_node) @@ -92,6 +94,84 @@ static int init_state_node(struct cpuidle_state *idle_st= ate, } =20 /* + * First look in cpu-idle-states, then walk up the power-domains tree, ret= urning + * the idx'th idle state node encountered, or NULL if it can't be found. + * + * Returns the device_node pointer with refcount incremented. Use + * of_node_put() on it when done. + */ +static struct device_node *get_idle_state_node(struct device_node *cpu_nod= e, +=09=09=09=09=09 int idx) +{ +=09struct device_node *state_node, *consumer_node; +=09int num_cpu_idle_states, num_pd_idle_states; +=09int idx_pd; +=09unsigned int i; + +=09state_node =3D of_parse_phandle(cpu_node, "cpu-idle-states", idx); +=09if (state_node) +=09=09return state_node; + +=09num_cpu_idle_states =3D of_property_count_elems_of_size(cpu_node, +=09=09=09=09=09=09=09 "cpu-idle-states", +=09=09=09=09=09=09=09 4); +=09if (num_cpu_idle_states < 1) { +=09=09pr_devel("%s: no cpu-idle-states for %s\n", __func__, +=09=09=09 cpu_node->name); +=09=09num_cpu_idle_states =3D 0; +=09} + +=09pr_devel("%s: Looking in %s's power domains tree for state %d\n", +=09=09 __func__, cpu_node->full_name, idx); + +=09BUG_ON(idx < num_cpu_idle_states); +=09idx_pd =3D idx - num_cpu_idle_states; + +=09/* Use a loop counter so we don't infiniloop if there's a cycle in the +=09 * power-domains graph. */ +=09consumer_node =3D cpu_node; +=09for (i =3D 0; i < MAX_PD_DEPTH; i++) { +=09=09int num_pd_providers; +=09=09struct device_node *pd_node =3D of_parse_phandle( +=09=09=09consumer_node, "power-domains", 0); +=09=09if (!pd_node) { +=09=09=09return NULL; +=09=09} + +=09=09/* We're assuming the power-domains graph is a tree, complain if +=09=09 * not. */ +=09=09num_pd_providers =3D of_property_count_elems_of_size( +=09=09=09consumer_node, "power-domains", 4); +=09=09BUG_ON(num_pd_providers < 1); +=09=09if (num_pd_providers > 1) +=09=09=09pr_warn("power-domains graph for %s not a tree " +=09=09=09=09"(%s has multiple power-domains)\n", +=09=09=09=09cpu_node->full_name, consumer_node->full_name); + + +=09=09state_node =3D of_parse_phandle(pd_node, "power-states", idx_pd); +=09=09if (state_node) { +=09=09=09of_node_put(pd_node); +=09=09=09return state_node; +=09=09} + +=09=09num_pd_idle_states =3D of_property_count_elems_of_size( +=09=09=09pd_node, "power-states", 4); +=09=09if (num_pd_idle_states < 1) +=09=09=09num_pd_idle_states =3D 0; + +=09=09BUG_ON(idx_pd < num_pd_idle_states); +=09=09idx_pd -=3D num_pd_idle_states; + +=09=09of_node_put(pd_node); +=09=09consumer_node =3D pd_node; +=09} + +=09WARN(1, "DT CPU power domain graph too deep. Probably has a cycle."); +=09return NULL; +} + +/* * Check that the idle state is uniform across all CPUs in the CPUidle dri= ver * cpumask */ @@ -112,8 +192,7 @@ static bool idle_state_valid(struct device_node *state_= node, unsigned int idx, =09for (cpu =3D cpumask_next(cpumask_first(cpumask), cpumask); =09 cpu < nr_cpu_ids; cpu =3D cpumask_next(cpu, cpumask)) { =09=09cpu_node =3D of_cpu_device_node_get(cpu); -=09=09curr_state_node =3D of_parse_phandle(cpu_node, "cpu-idle-states", -=09=09=09=09=09=09 idx); +=09=09curr_state_node =3D get_idle_state_node(cpu_node, idx); =09=09if (state_node !=3D curr_state_node) =09=09=09valid =3D false; =20 @@ -170,7 +249,7 @@ int dt_init_idle_driver(struct cpuidle_driver *drv, =09cpu_node =3D of_cpu_device_node_get(cpumask_first(cpumask)); =20 =09for (i =3D 0; ; i++) { -=09=09state_node =3D of_parse_phandle(cpu_node, "cpu-idle-states", i); +=09=09state_node =3D get_idle_state_node(cpu_node,i); =09=09if (!state_node) =09=09=09break; =20 --=20 1.9.1 --------------030900010906090709090600--