* [PATCH 0/4] arm: remove cpu_efficiency
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC
To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
    linux-renesas-soc
Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
    Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

For Cortex-A15/A7 arm big.LITTLE systems there are currently two ways to
set the cpu capacity.

The first one (commit 06073ee26775 "ARM: 8621/3: parse cpu
capacity-dmips-mhz from DT") is based on dt 'cpu capacity-dmips-mhz'
bindings and the appropriate dt parsing code in
drivers/base/arch_topology.c. It further takes differences in maximum
cpu frequency values into consideration, normalizes the maximum cpu
capacity to SCHED_CAPACITY_SCALE (1024) and scales all the cpus
accordingly.

  cpu capacity = (capacity-dmips-mhz * max cpu frequency) /
                 (max capacity-dmips-mhz * max (max cpu frequency))

This solution is shared between arm and arm64 and works for other
combinations of big and little cpus (besides Cortex-A15/A7) as well.

The second one (commit 339ca09d7ada "ARM: 7463/1: topology: Update
cpu_power according to DT information") is based on the
'struct cpu_efficiency table_efficiency[]' and the dt parsing code in
arch/arm/kernel/topology.c. It further requires a clock-frequency
property per cpu node, calculates a so-called middle frequency for an
average cpu in the system which is as close as possible to
SCHED_CAPACITY_SCALE (1024) and uses this to compute the cpu capacity
values.

  cpu capacity = (cpu efficiency * clock frequency) / middle capacity

This solution only works for Cortex-A15/A7 arm big.LITTLE systems.

The aim of this patch-set is to have only one solution for all arm and
arm64 big.LITTLE platforms.

(1) Therefore, it removes the code for the 'cpu_efficiency/
    clock-frequency dt property' (second) solution [patch 01/04] and
    migrates the arm big.LITTLE platforms currently using this approach
    [patch 02-04/04] to use the 'cpu capacity-dmips-mhz' (first)
    solution.

(2) Moreover, it will also ensure that the highest original cpu
    capacity (rq->cpu_capacity_orig) in a non-smt system is
    SCHED_CAPACITY_SCALE (1024).

(3) And finally, another advantage is the dynamic detection of the max
    cpu frequency which comes with the first solution instead of the
    static clock-frequency dt property value.

Currently, the arm dt parsing code in parse_dt_topology() checks if the
dt uses the capacity-dmips-mhz property. If this is the case it uses the
first, otherwise the second solution.

This patch-set removes the code for the second solution from
arch/arm/kernel/topology.c.

The following arm big.LITTLE platforms which use cpu node descriptions
with the 'compatible' properties "arm,cortex-a15" and "arm,cortex-a7" as
well as the "clock-frequency" property are (theoretically*) affected:

(1) arndale-octa, peach-pi, peach-pit, smdk5420 (exynos5420-cpus.dtsi)

(2) odroidxu3, odroidxu3-lite, odroidxu4 (exynos5422-cpus.dtsi)

(3) r8a7790-lager (r8a7790.dtsi)

TC2 (vexpress-v2p-ca15_a7.dts) already has the capacity-dmips-mhz
properties (it never had "clock-frequency" properties per cpu node
though).

*Currently, these platforms are only theoretically affected. The reason
is that heterogeneous cpu capacity support on arm stopped with commit
8cd5601c5060 ("sched/fair: Convert arch_scale_cpu_capacity() from weak
function to #define"): the arch never defined arch_scale_cpu_capacity,
so the task scheduler uses the default implementation in
kernel/sched/sched.h. This will change as soon as the patch "arm: wire
cpu-invariant accounting support up to the task scheduler" [1] is in
mainline.

This patch-set has been tested on TC2 and Samsung Chromebook 2 13"
(peach-pi, Exynos 5800).

[1] https://marc.info/?l=linux-kernel&m=150367158111303&w=2

Dietmar Eggemann (4):
  arm: topology: remove cpu_efficiency
  arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  arm: dts: exynos: add exynos5422 cpu capacity-dmips-mhz information
  arm: dts: r8a7790: add cpu capacity-dmips-mhz information

 arch/arm/boot/dts/exynos5420-cpus.dtsi |   8 +++
 arch/arm/boot/dts/exynos5422-cpus.dtsi |   8 +++
 arch/arm/boot/dts/r8a7790.dtsi         |   8 +++
 arch/arm/kernel/topology.c             | 113 +--------------------------------
 4 files changed, 27 insertions(+), 110 deletions(-)

-- 
2.11.0
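
The formula of the first solution in the cover letter can be sketched in
a few lines of C. This is only an illustration of the arithmetic, not
the code in drivers/base/arch_topology.c: the function name and its
parameters are made up for this example, and the explicit multiplication
by SCHED_CAPACITY_SCALE stands in for the normalization step the cover
letter describes in words.

```c
#include <assert.h>
#include <stdint.h>

#define SCHED_CAPACITY_SCALE 1024ULL

/*
 * Hypothetical helper, following the cover letter formula:
 *
 *   capacity = SCHED_CAPACITY_SCALE *
 *              (capacity-dmips-mhz * max cpu frequency) /
 *              (max capacity-dmips-mhz * max (max cpu frequency))
 *
 * The frequencies can be in any unit as long as all cpus use the same
 * one. The result is truncated towards zero (an assumption of this
 * sketch).
 */
uint64_t scaled_cpu_capacity(uint64_t dmips_mhz, uint64_t max_freq,
			     uint64_t sys_max_dmips_mhz,
			     uint64_t sys_max_freq)
{
	assert(sys_max_dmips_mhz != 0 && sys_max_freq != 0);

	return (SCHED_CAPACITY_SCALE * dmips_mhz * max_freq) /
	       (sys_max_dmips_mhz * sys_max_freq);
}
```

With the values used for exynos5420 in patch 2/4 (capacity-dmips-mhz
1024/539, max cpu frequency 1800000/1300000 kHz) this gives 1024 for the
big cpus and 389 for the little cpus.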
* [PATCH 1/4] arm: topology: remove cpu_efficiency
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC
To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
    linux-renesas-soc
Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
    Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

Remove the 'cpu_efficiency/clock-frequency dt property' based solution
to set cpu capacity which was only working for Cortex-A15/A7 arm
big.LITTLE systems.

I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
shared between arm and arm64 and works for every big.LITTLE system no
matter which core types it consists of.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/kernel/topology.c | 113 ++-------------------------------------------
 1 file changed, 3 insertions(+), 110 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index bf949a763dbe..04ccfdd94213 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -47,52 +47,10 @@
  */
 
 #ifdef CONFIG_OF
-struct cpu_efficiency {
-	const char *compatible;
-	unsigned long efficiency;
-};
-
-/*
- * Table of relative efficiency of each processors
- * The efficiency value must fit in 20bit and the final
- * cpu_scale value must be in the range
- * 0 < cpu_scale < 3*SCHED_CAPACITY_SCALE/2
- * in order to return at most 1 when DIV_ROUND_CLOSEST
- * is used to compute the capacity of a CPU.
- * Processors that are not defined in the table,
- * use the default SCHED_CAPACITY_SCALE value for cpu_scale.
- */
-static const struct cpu_efficiency table_efficiency[] = {
-	{"arm,cortex-a15", 3891},
-	{"arm,cortex-a7", 2048},
-	{NULL, },
-};
-
-static unsigned long *__cpu_capacity;
-#define cpu_capacity(cpu) __cpu_capacity[cpu]
-
-static unsigned long middle_capacity = 1;
-static bool cap_from_dt = true;
-
-/*
- * Iterate all CPUs' descriptor in DT and compute the efficiency
- * (as per table_efficiency). Also calculate a middle efficiency
- * as close as possible to (max{eff_i} - min{eff_i}) / 2
- * This is later used to scale the cpu_capacity field such that an
- * 'average' CPU is of middle capacity. Also see the comments near
- * table_efficiency[] and update_cpu_capacity().
- */
 static void __init parse_dt_topology(void)
 {
-	const struct cpu_efficiency *cpu_eff;
-	struct device_node *cn = NULL;
-	unsigned long min_capacity = ULONG_MAX;
-	unsigned long max_capacity = 0;
-	unsigned long capacity = 0;
-	int cpu = 0;
-
-	__cpu_capacity = kcalloc(nr_cpu_ids, sizeof(*__cpu_capacity),
-				 GFP_NOWAIT);
+	struct device_node *cn;
+	int cpu;
 
 	cn = of_find_node_by_path("/cpus");
 	if (!cn) {
@@ -101,9 +59,6 @@ static void __init parse_dt_topology(void)
 	}
 
 	for_each_possible_cpu(cpu) {
-		const u32 *rate;
-		int len;
-
 		/* too early to use cpu->of_node */
 		cn = of_get_cpu_node(cpu, NULL);
 		if (!cn) {
@@ -115,73 +70,13 @@ static void __init parse_dt_topology(void)
 			of_node_put(cn);
 			continue;
 		}
-
-		cap_from_dt = false;
-
-		for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
-			if (of_device_is_compatible(cn, cpu_eff->compatible))
-				break;
-
-		if (cpu_eff->compatible == NULL)
-			continue;
-
-		rate = of_get_property(cn, "clock-frequency", &len);
-		if (!rate || len != 4) {
-			pr_err("%s missing clock-frequency property\n",
-				cn->full_name);
-			continue;
-		}
-
-		capacity = ((be32_to_cpup(rate)) >> 20) * cpu_eff->efficiency;
-
-		/* Save min capacity of the system */
-		if (capacity < min_capacity)
-			min_capacity = capacity;
-
-		/* Save max capacity of the system */
-		if (capacity > max_capacity)
-			max_capacity = capacity;
-
-		cpu_capacity(cpu) = capacity;
 	}
 
-	/* If min and max capacities are equals, we bypass the update of the
-	 * cpu_scale because all CPUs have the same capacity. Otherwise, we
-	 * compute a middle_capacity factor that will ensure that the capacity
-	 * of an 'average' CPU of the system will be as close as possible to
-	 * SCHED_CAPACITY_SCALE, which is the default value, but with the
-	 * constraint explained near table_efficiency[].
-	 */
-	if (4*max_capacity < (3*(max_capacity + min_capacity)))
-		middle_capacity = (min_capacity + max_capacity)
-				>> (SCHED_CAPACITY_SHIFT+1);
-	else
-		middle_capacity = ((max_capacity / 3)
-				>> (SCHED_CAPACITY_SHIFT-1)) + 1;
-
-	if (cap_from_dt)
-		topology_normalize_cpu_scale();
-}
-
-/*
- * Look for a customed capacity of a CPU in the cpu_capacity table during the
- * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
- * function returns directly for SMP system.
- */
-static void update_cpu_capacity(unsigned int cpu)
-{
-	if (!cpu_capacity(cpu) || cap_from_dt)
-		return;
-
-	topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
-
-	pr_info("CPU%u: update cpu_capacity %lu\n",
-		cpu, topology_get_cpu_scale(NULL, cpu));
+	topology_normalize_cpu_scale();
 }
 
 #else
 static inline void parse_dt_topology(void) {}
-static inline void update_cpu_capacity(unsigned int cpuid) {}
 #endif
 
 /*
@@ -277,8 +172,6 @@ void store_cpu_topology(unsigned int cpuid)
 
 	update_siblings_masks(cpuid);
 
-	update_cpu_capacity(cpuid);
-
 	pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
 		cpuid, cpu_topology[cpuid].thread_id,
 		cpu_topology[cpuid].core_id,
-- 
2.11.0
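
For readers who want to follow what the removed code computed, the
arithmetic of the 'cpu_efficiency/clock-frequency' solution can be
replayed in a standalone sketch. The three helpers below are
hypothetical names that mirror the removed expressions in
parse_dt_topology() and update_cpu_capacity(); they are not kernel
functions. The clock-frequency inputs used in the note that follows
(1.8 GHz for the Cortex-A15s, 1 GHz for the Cortex-A7s) are assumptions,
chosen because they reproduce the cpu_capacity values quoted in
patch 2/4.

```c
#include <assert.h>

#define SCHED_CAPACITY_SHIFT 10

/* Raw capacity of one cpu: efficiency * (clock-frequency >> 20). */
unsigned long legacy_raw_capacity(unsigned long efficiency,
				  unsigned long clock_hz)
{
	return (clock_hz >> 20) * efficiency;
}

/* Middle capacity derived from the smallest and largest raw capacity. */
unsigned long legacy_middle_capacity(unsigned long min_cap,
				     unsigned long max_cap)
{
	assert(min_cap <= max_cap);

	if (4 * max_cap < 3 * (max_cap + min_cap))
		return (min_cap + max_cap) >> (SCHED_CAPACITY_SHIFT + 1);

	return ((max_cap / 3) >> (SCHED_CAPACITY_SHIFT - 1)) + 1;
}

/* Final value handed to the scheduler: raw capacity / middle capacity. */
unsigned long legacy_cpu_capacity(unsigned long raw, unsigned long middle)
{
	assert(middle != 0);

	return raw / middle;
}
```

With efficiencies 3891/2048 and the assumed clock-frequency values, the
middle capacity is 4347 and the resulting capacities are 1535 and 448.
The big cpus end up above SCHED_CAPACITY_SCALE, which is one of the
things the 'capacity-dmips-mhz' solution avoids.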
* Re: [PATCH 1/4] arm: topology: remove cpu_efficiency
From: Vincent Guittot @ 2017-09-04 7:49 UTC
To: Dietmar Eggemann
Cc: linux-kernel, LAK, devicetree@vger.kernel.org, linux-samsung-soc,
    linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
    Kukjin Kim, Krzysztof Kozlowski, Juri Lelli

Hi Dietmar,

Removing the cpu efficiency table looks good to me. Nevertheless, I have
some comments below for this patch.

On 30 August 2017 at 16:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> Remove the 'cpu_efficiency/clock-frequency dt property' based solution
> to set cpu capacity which was only working for Cortex-A15/A7 arm
> big.LITTLE systems.
>
> I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
> shared between arm and arm64 and works for every big.LITTLE system no
> matter which core types it consists of.
>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  arch/arm/kernel/topology.c | 113 ++-------------------------------------------
>  1 file changed, 3 insertions(+), 110 deletions(-)
>
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index bf949a763dbe..04ccfdd94213 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -47,52 +47,10 @@
>   */
>
>  #ifdef CONFIG_OF
> -struct cpu_efficiency {
> -	const char *compatible;
> -	unsigned long efficiency;
> -};
> -
> -/*
> - * Table of relative efficiency of each processors
> - * The efficiency value must fit in 20bit and the final
> - * cpu_scale value must be in the range
> - * 0 < cpu_scale < 3*SCHED_CAPACITY_SCALE/2
> - * in order to return at most 1 when DIV_ROUND_CLOSEST
> - * is used to compute the capacity of a CPU.
> - * Processors that are not defined in the table,
> - * use the default SCHED_CAPACITY_SCALE value for cpu_scale.
> - */
> -static const struct cpu_efficiency table_efficiency[] = {
> -	{"arm,cortex-a15", 3891},
> -	{"arm,cortex-a7", 2048},
> -	{NULL, },
> -};
> -
> -static unsigned long *__cpu_capacity;
> -#define cpu_capacity(cpu) __cpu_capacity[cpu]
> -
> -static unsigned long middle_capacity = 1;
> -static bool cap_from_dt = true;
> -
> -/*
> - * Iterate all CPUs' descriptor in DT and compute the efficiency
> - * (as per table_efficiency). Also calculate a middle efficiency
> - * as close as possible to (max{eff_i} - min{eff_i}) / 2
> - * This is later used to scale the cpu_capacity field such that an
> - * 'average' CPU is of middle capacity. Also see the comments near
> - * table_efficiency[] and update_cpu_capacity().
> - */
>  static void __init parse_dt_topology(void)
>  {
> -	const struct cpu_efficiency *cpu_eff;
> -	struct device_node *cn = NULL;
> -	unsigned long min_capacity = ULONG_MAX;
> -	unsigned long max_capacity = 0;
> -	unsigned long capacity = 0;
> -	int cpu = 0;
> -
> -	__cpu_capacity = kcalloc(nr_cpu_ids, sizeof(*__cpu_capacity),
> -				 GFP_NOWAIT);
> +	struct device_node *cn;
> +	int cpu;
>
>  	cn = of_find_node_by_path("/cpus");
>  	if (!cn) {
> @@ -101,9 +59,6 @@ static void __init parse_dt_topology(void)
>  	}
>
>  	for_each_possible_cpu(cpu) {
> -		const u32 *rate;
> -		int len;
> -
>  		/* too early to use cpu->of_node */
>  		cn = of_get_cpu_node(cpu, NULL);
>  		if (!cn) {
> @@ -115,73 +70,13 @@ static void __init parse_dt_topology(void)
>  			of_node_put(cn);
>  			continue;

AFAICT, this continue is now useless as it was there to skip the cpu
table efficiency method.

>  		}
> -
> -		cap_from_dt = false;
> -
> -		for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
> -			if (of_device_is_compatible(cn, cpu_eff->compatible))
> -				break;
> -
> -		if (cpu_eff->compatible == NULL)
> -			continue;
> -
> -		rate = of_get_property(cn, "clock-frequency", &len);
> -		if (!rate || len != 4) {
> -			pr_err("%s missing clock-frequency property\n",
> -				cn->full_name);
> -			continue;
> -		}
> -
> -		capacity = ((be32_to_cpup(rate)) >> 20) * cpu_eff->efficiency;
> -
> -		/* Save min capacity of the system */
> -		if (capacity < min_capacity)
> -			min_capacity = capacity;
> -
> -		/* Save max capacity of the system */
> -		if (capacity > max_capacity)
> -			max_capacity = capacity;
> -
> -		cpu_capacity(cpu) = capacity;
>  	}
>
> -	/* If min and max capacities are equals, we bypass the update of the
> -	 * cpu_scale because all CPUs have the same capacity. Otherwise, we
> -	 * compute a middle_capacity factor that will ensure that the capacity
> -	 * of an 'average' CPU of the system will be as close as possible to
> -	 * SCHED_CAPACITY_SCALE, which is the default value, but with the
> -	 * constraint explained near table_efficiency[].
> -	 */
> -	if (4*max_capacity < (3*(max_capacity + min_capacity)))
> -		middle_capacity = (min_capacity + max_capacity)
> -				>> (SCHED_CAPACITY_SHIFT+1);
> -	else
> -		middle_capacity = ((max_capacity / 3)
> -				>> (SCHED_CAPACITY_SHIFT-1)) + 1;
> -
> -	if (cap_from_dt)
> -		topology_normalize_cpu_scale();

Why have you moved the call to topology_normalize_cpu_scale() from
parse_dt_topology() to update_cpu_capacity()?

You should keep it in parse_dt_topology() as it is part of the dt
parsing sequence.

> -}
> -
> -/*
> - * Look for a customed capacity of a CPU in the cpu_capacity table during the
> - * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
> - * function returns directly for SMP system.
> - */
> -static void update_cpu_capacity(unsigned int cpu)
> -{
> -	if (!cpu_capacity(cpu) || cap_from_dt)
> -		return;
> -
> -	topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
> -
> -	pr_info("CPU%u: update cpu_capacity %lu\n",
> -		cpu, topology_get_cpu_scale(NULL, cpu));
> +	topology_normalize_cpu_scale();
>  }

You can probably just remove update_cpu_capacity().

>
>  #else
>  static inline void parse_dt_topology(void) {}
> -static inline void update_cpu_capacity(unsigned int cpuid) {}
>  #endif
>
>  /*
> @@ -277,8 +172,6 @@ void store_cpu_topology(unsigned int cpuid)
>
>  	update_siblings_masks(cpuid);
>
> -	update_cpu_capacity(cpuid);
> -
>  	pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
>  		cpuid, cpu_topology[cpuid].thread_id,
>  		cpu_topology[cpuid].core_id,
> --
> 2.11.0
>
* Re: [PATCH 1/4] arm: topology: remove cpu_efficiency
From: Dietmar Eggemann @ 2017-09-06 11:43 UTC
To: Vincent Guittot
Cc: linux-kernel, LAK, devicetree@vger.kernel.org, linux-samsung-soc,
    linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
    Kukjin Kim, Krzysztof Kozlowski, Juri Lelli

Hi Vincent,

On 04/09/17 08:49, Vincent Guittot wrote:
> Hi Dietmar,
>
> Removing the cpu efficiency table looks good to me. Nevertheless, I have
> some comments below for this patch.

Thanks for the review!

> On 30 August 2017 at 16:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>> Remove the 'cpu_efficiency/clock-frequency dt property' based solution
>> to set cpu capacity which was only working for Cortex-A15/A7 arm
>> big.LITTLE systems.
>>
>> I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
>> shared between arm and arm64 and works for every big.LITTLE system no
>> matter which core types it consists of.
>>
>> Cc: Russell King <linux@arm.linux.org.uk>
>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>> Cc: Juri Lelli <juri.lelli@arm.com>
>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>> ---
>>  arch/arm/kernel/topology.c | 113 ++-------------------------------------------
>>  1 file changed, 3 insertions(+), 110 deletions(-)

[...]

>> @@ -115,73 +70,13 @@ static void __init parse_dt_topology(void)
>>  			of_node_put(cn);
>>  			continue;
>
> AFAICT, this continue is now useless as it was there to skip the cpu
> table efficiency method.

You're right ... will remove it.

[...]

>> -	if (cap_from_dt)
>> -		topology_normalize_cpu_scale();
>
> Why have you moved the call to topology_normalize_cpu_scale() from
> parse_dt_topology() to update_cpu_capacity()?

I didn't move it. It's still called from parse_dt_topology().

> You should keep it in parse_dt_topology() as it is part of the dt
> parsing sequence.

Yes, this should be the case.

[...]

>> -/*
>> - * Look for a customed capacity of a CPU in the cpu_capacity table during the
>> - * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
>> - * function returns directly for SMP system.
>> - */
>> -static void update_cpu_capacity(unsigned int cpu)
>> -{
>> -	if (!cpu_capacity(cpu) || cap_from_dt)
>> -		return;
>> -
>> -	topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
>> -
>> -	pr_info("CPU%u: update cpu_capacity %lu\n",
>> -		cpu, topology_get_cpu_scale(NULL, cpu));
>> +	topology_normalize_cpu_scale();
>>  }
>
> You can probably just remove update_cpu_capacity().

I did remove update_cpu_capacity(). Maybe the patch layout is confusing?

[...]
* Re: [PATCH 1/4] arm: topology: remove cpu_efficiency
From: Vincent Guittot @ 2017-09-06 12:40 UTC
To: Dietmar Eggemann
Cc: linux-kernel, LAK, devicetree@vger.kernel.org, linux-samsung-soc,
    linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
    Kukjin Kim, Krzysztof Kozlowski, Juri Lelli

On 6 September 2017 at 13:43, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> Hi Vincent,
>
> On 04/09/17 08:49, Vincent Guittot wrote:
>> Hi Dietmar,
>>
>> Removing the cpu efficiency table looks good to me. Nevertheless, I have
>> some comments below for this patch.
>
> Thanks for the review!
>
>> On 30 August 2017 at 16:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>>> Remove the 'cpu_efficiency/clock-frequency dt property' based solution
>>> to set cpu capacity which was only working for Cortex-A15/A7 arm
>>> big.LITTLE systems.
>>>
>>> I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
>>> shared between arm and arm64 and works for every big.LITTLE system no
>>> matter which core types it consists of.
>>>
>>> Cc: Russell King <linux@arm.linux.org.uk>
>>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>>> Cc: Juri Lelli <juri.lelli@arm.com>
>>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>>> ---
>>>  arch/arm/kernel/topology.c | 113 ++-------------------------------------------
>>>  1 file changed, 3 insertions(+), 110 deletions(-)
>
> [...]
>
>>> @@ -115,73 +70,13 @@ static void __init parse_dt_topology(void)
>>>  			of_node_put(cn);
>>>  			continue;
>>
>> AFAICT, this continue is now useless as it was there to skip the cpu
>> table efficiency method.
>
> You're right ... will remove it.
>
> [...]
>
>>> -	if (cap_from_dt)
>>> -		topology_normalize_cpu_scale();
>>
>> Why have you moved the call to topology_normalize_cpu_scale() from
>> parse_dt_topology() to update_cpu_capacity()?
>
> I didn't move it. It's still called from parse_dt_topology().
>
>> You should keep it in parse_dt_topology() as it is part of the dt
>> parsing sequence.
>
> Yes, this should be the case.
>
> [...]
>
>>> -/*
>>> - * Look for a customed capacity of a CPU in the cpu_capacity table during the
>>> - * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
>>> - * function returns directly for SMP system.
>>> - */
>>> -static void update_cpu_capacity(unsigned int cpu)
>>> -{
>>> -	if (!cpu_capacity(cpu) || cap_from_dt)
>>> -		return;
>>> -
>>> -	topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
>>> -
>>> -	pr_info("CPU%u: update cpu_capacity %lu\n",
>>> -		cpu, topology_get_cpu_scale(NULL, cpu));
>>> +	topology_normalize_cpu_scale();
>>>  }
>>
>> You can probably just remove update_cpu_capacity().
>
> I did remove update_cpu_capacity(). Maybe the patch layout is confusing?

Yes, you're right. I was confused by the layout.

>
> [...]
* Re: [PATCH 1/4] arm: topology: remove cpu_efficiency
From: Dietmar Eggemann @ 2017-09-07 10:41 UTC
To: Vincent Guittot
Cc: linux-kernel, LAK, devicetree@vger.kernel.org, linux-samsung-soc,
    linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
    Kukjin Kim, Krzysztof Kozlowski, Juri Lelli

On 06/09/17 13:40, Vincent Guittot wrote:
> On 6 September 2017 at 13:43, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>> Hi Vincent,
>>
>> On 04/09/17 08:49, Vincent Guittot wrote:
>>> Hi Dietmar,
>>>
>>> Removing cpu effificiency table looks good to me. Nevertheless, i have
>>> some comments below for this patch.
>>
>> Thanks for the review!
>>
>>> On 30 August 2017 at 16:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:

[...]

I fixed the issue with the continue statement.

Moreover, I think we should also remove the comment block about 'cpu
capacity scale management' and 'cpu capacity table' on top of
parse_dt_topology() because this is now all handled in
drivers/base/arch_topology.c.

-- >8 --

From: Dietmar Eggemann <dietmar.eggemann@arm.com>
Date: Sun, 9 Jul 2017 23:43:43 +0100
Subject: [PATCH] arm: topology: remove cpu_efficiency

Remove the 'cpu_efficiency/clock-frequency dt property' based solution
to set cpu capacity which was only working for Cortex-A15/A7 arm
big.LITTLE systems.

I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
shared between arm and arm64 and works for every big.LITTLE system no
matter which core types it consists of.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/kernel/topology.c | 130 ++-------------------------------------------
 1 file changed, 3 insertions(+), 127 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index bf949a763dbe..5f46d290e34b 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -30,69 +30,11 @@
 #include <asm/cputype.h>
 #include <asm/topology.h>
 
-/*
- * cpu capacity scale management
- */
-
-/*
- * cpu capacity table
- * This per cpu data structure describes the relative capacity of each core.
- * On a heteregenous system, cores don't have the same computation capacity
- * and we reflect that difference in the cpu_capacity field so the scheduler
- * can take this difference into account during load balance. A per cpu
- * structure is preferred because each CPU updates its own cpu_capacity field
- * during the load balance except for idle cores. One idle core is selected
- * to run the rebalance_domains for all idle cores and the cpu_capacity can be
- * updated during this sequence.
- */
-
 #ifdef CONFIG_OF
-struct cpu_efficiency {
-	const char *compatible;
-	unsigned long efficiency;
-};
-
-/*
- * Table of relative efficiency of each processors
- * The efficiency value must fit in 20bit and the final
- * cpu_scale value must be in the range
- * 0 < cpu_scale < 3*SCHED_CAPACITY_SCALE/2
- * in order to return at most 1 when DIV_ROUND_CLOSEST
- * is used to compute the capacity of a CPU.
- * Processors that are not defined in the table,
- * use the default SCHED_CAPACITY_SCALE value for cpu_scale.
- */
-static const struct cpu_efficiency table_efficiency[] = {
-	{"arm,cortex-a15", 3891},
-	{"arm,cortex-a7", 2048},
-	{NULL, },
-};
-
-static unsigned long *__cpu_capacity;
-#define cpu_capacity(cpu) __cpu_capacity[cpu]
-
-static unsigned long middle_capacity = 1;
-static bool cap_from_dt = true;
-
-/*
- * Iterate all CPUs' descriptor in DT and compute the efficiency
- * (as per table_efficiency). Also calculate a middle efficiency
- * as close as possible to (max{eff_i} - min{eff_i}) / 2
- * This is later used to scale the cpu_capacity field such that an
- * 'average' CPU is of middle capacity. Also see the comments near
- * table_efficiency[] and update_cpu_capacity().
- */
 static void __init parse_dt_topology(void)
 {
-	const struct cpu_efficiency *cpu_eff;
-	struct device_node *cn = NULL;
-	unsigned long min_capacity = ULONG_MAX;
-	unsigned long max_capacity = 0;
-	unsigned long capacity = 0;
-	int cpu = 0;
-
-	__cpu_capacity = kcalloc(nr_cpu_ids, sizeof(*__cpu_capacity),
-				 GFP_NOWAIT);
+	struct device_node *cn;
+	int cpu;
 
 	cn = of_find_node_by_path("/cpus");
 	if (!cn) {
@@ -101,9 +43,6 @@ static void __init parse_dt_topology(void)
 	}
 
 	for_each_possible_cpu(cpu) {
-		const u32 *rate;
-		int len;
-
 		/* too early to use cpu->of_node */
 		cn = of_get_cpu_node(cpu, NULL);
 		if (!cn) {
@@ -113,75 +52,14 @@ static void __init parse_dt_topology(void)
 
 		if (topology_parse_cpu_capacity(cn, cpu)) {
 			of_node_put(cn);
-			continue;
 		}
-
-		cap_from_dt = false;
-
-		for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
-			if (of_device_is_compatible(cn, cpu_eff->compatible))
-				break;
-
-		if (cpu_eff->compatible == NULL)
-			continue;
-
-		rate = of_get_property(cn, "clock-frequency", &len);
-		if (!rate || len != 4) {
-			pr_err("%s missing clock-frequency property\n",
-				cn->full_name);
-			continue;
-		}
-
-		capacity = ((be32_to_cpup(rate)) >> 20) * cpu_eff->efficiency;
-
-		/* Save min capacity of the system */
-		if (capacity < min_capacity)
-			min_capacity = capacity;
-
-		/* Save max capacity of the system */
-		if (capacity > max_capacity)
-			max_capacity = capacity;
-
-		cpu_capacity(cpu) = capacity;
 	}
 
-	/* If min and max capacities are equals, we bypass the update of the
-	 * cpu_scale because all CPUs have the same capacity. Otherwise, we
-	 * compute a middle_capacity factor that will ensure that the capacity
-	 * of an 'average' CPU of the system will be as close as possible to
-	 * SCHED_CAPACITY_SCALE, which is the default value, but with the
-	 * constraint explained near table_efficiency[].
-	 */
-	if (4*max_capacity < (3*(max_capacity + min_capacity)))
-		middle_capacity = (min_capacity + max_capacity)
-				>> (SCHED_CAPACITY_SHIFT+1);
-	else
-		middle_capacity = ((max_capacity / 3)
-				>> (SCHED_CAPACITY_SHIFT-1)) + 1;
-
-	if (cap_from_dt)
-		topology_normalize_cpu_scale();
-}
-
-/*
- * Look for a customed capacity of a CPU in the cpu_capacity table during the
- * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
- * function returns directly for SMP system.
- */
-static void update_cpu_capacity(unsigned int cpu)
-{
-	if (!cpu_capacity(cpu) || cap_from_dt)
-		return;
-
-	topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
-
-	pr_info("CPU%u: update cpu_capacity %lu\n",
-		cpu, topology_get_cpu_scale(NULL, cpu));
+	topology_normalize_cpu_scale();
 }
 
 #else
 static inline void parse_dt_topology(void) {}
-static inline void update_cpu_capacity(unsigned int cpuid) {}
 #endif
 
 /*
@@ -277,8 +155,6 @@ void store_cpu_topology(unsigned int cpuid)
 
 	update_siblings_masks(cpuid);
 
-	update_cpu_capacity(cpuid);
-
 	pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
 		cpuid, cpu_topology[cpuid].thread_id,
 		cpu_topology[cpuid].core_id,
-- 
2.11.0
* [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information 2017-08-30 14:41 [PATCH 0/4] arm: remove cpu_efficiency Dietmar Eggemann 2017-08-30 14:41 ` [PATCH 1/4] arm: topology: " Dietmar Eggemann @ 2017-08-30 14:41 ` Dietmar Eggemann [not found] ` <20170830144120.9312-3-dietmar.eggemann-5wv7dgnIgG8@public.gmane.org> 2017-09-17 7:37 ` Krzysztof Kozlowski [not found] ` <20170830144120.9312-1-dietmar.eggemann-5wv7dgnIgG8@public.gmane.org> 2017-08-30 14:41 ` [PATCH 4/4] arm: dts: r8a7790: add " Dietmar Eggemann 3 siblings, 2 replies; 17+ messages in thread From: Dietmar Eggemann @ 2017-08-30 14:41 UTC (permalink / raw) To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc, linux-renesas-soc Cc: Mark Rutland, Vincent Guittot, Juri Lelli, Russell King, Rob Herring, Kukjin Kim, Krzysztof Kozlowski The following 'capacity-dmips-mhz' dt property values are used: Cortex-A15: 1024, Cortex-A7: 539 They have been derived from the cpu_efficiency values: Cortex-A15: 3891, Cortex-A7: 2048 by scaling them so that the Cortex-A15s (big cores) use 1024. The cpu_efficiency values were originally derived from the "Big.LITTLE Processing with ARM Cortex™-A15 & Cortex-A7" white paper (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the Dhrystone benchmark. The following platforms are affected once cpu-invariant accounting support is re-connected to the task scheduler: arndale-octa, peach-pi, peach-pit, smdk5420 The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos 5800). $ cat /sys/devices/system/cpu/cpu*/cpu_capacity 1024 1024 1024 1024 389 389 389 389 The Cortex-A15 vs Cortex-A7 performance ratio is 1024/389 = 2.63. The values derived with the 'cpu_efficiency/clock-frequency dt property' solution are: $ cat /sys/devices/system/cpu/cpu*/cpu_capacity 1535 1535 1535 1535 448 448 448 448 The Cortex-A15 vs Cortex-A7 performance ratio is 1535/448 = 3.43. 
The discrepancy between 2.63 and 3.43 is due to the false assumption
when using the 'cpu_efficiency/clock-frequency dt property' solution
that the max cpu frequency of the little cpus is 1 GHz and not 1.3 GHz.
The Cortex-A7 cluster runs with a max cpu frequency of 1.3 GHz whereas
the 'clock-frequency' property value is set to 1 GHz.

3.43/1.3 = 2.64

$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
1800000
1800000
1800000
1800000
1300000 <-- max cpu frequency of the Cortex-A7s (little cores)
1300000
1300000
1300000

Running another benchmark (single-threaded sysbench affine to the
individual cpus) with performance cpufreq governor on the Samsung
Chromebook 2 13" showed the following numbers:

$ for i in `seq 0 7`; do taskset -c $i sysbench --test=cpu --num-threads=1 --max-time=10 run | grep "total number of events:"; done

total number of events: 1083
total number of events: 1085
total number of events: 1085
total number of events: 1085
total number of events: 454
total number of events: 454
total number of events: 454
total number of events: 454

The Cortex-A15 vs Cortex-A7 performance ratio is 2.39, i.e. very close
to the one derived from the Dhrystone based one of the "Big.LITTLE
Processing with ARM Cortex™-A15 & Cortex-A7" white paper (2.63).

We don't aim for exact values for the cpu capacity values. Besides the
CPI (Cycles Per Instruction), the instruction mix and whether the system
runs cpu-bound or memory-bound have an impact on the cpu capacity values
derived from these benchmark results.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/boot/dts/exynos5420-cpus.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420-cpus.dtsi b/arch/arm/boot/dts/exynos5420-cpus.dtsi
index 5c052d7ff554..d7d703aa1699 100644
--- a/arch/arm/boot/dts/exynos5420-cpus.dtsi
+++ b/arch/arm/boot/dts/exynos5420-cpus.dtsi
@@ -36,6 +36,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu1: cpu@1 {
@@ -48,6 +49,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu2: cpu@2 {
@@ -60,6 +62,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu3: cpu@3 {
@@ -72,6 +75,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu4: cpu@100 {
@@ -85,6 +89,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <7>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu5: cpu@101 {
@@ -97,6 +102,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <7>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu6: cpu@102 {
@@ -109,6 +115,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <7>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu7: cpu@103 {
@@ -121,6 +128,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <7>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 	};
 };
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread
[parent not found: <20170830144120.9312-3-dietmar.eggemann-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information [not found] ` <20170830144120.9312-3-dietmar.eggemann-5wv7dgnIgG8@public.gmane.org> @ 2017-08-30 20:26 ` Krzysztof Kozlowski 2017-08-31 10:36 ` Dietmar Eggemann 0 siblings, 1 reply; 17+ messages in thread From: Krzysztof Kozlowski @ 2017-08-30 20:26 UTC (permalink / raw) To: Dietmar Eggemann Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA, linux-renesas-soc-u79uwXL29TY76Z2rM5mHXA, Russell King, Rob Herring, Mark Rutland, Kukjin Kim, Vincent Guittot, Juri Lelli On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote: > The following 'capacity-dmips-mhz' dt property values are used: > > Cortex-A15: 1024, Cortex-A7: 539 > > They have been derived from the cpu_efficiency values: > > Cortex-A15: 3891, Cortex-A7: 2048 > > by scaling them so that the Cortex-A15s (big cores) use 1024. > > The cpu_efficiency values were originally derived from the "Big.LITTLE > Processing with ARM Cortex™-A15 & Cortex-A7" white paper > (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x > (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the > Dhrystone benchmark. > > The following platforms are affected once cpu-invariant accounting > support is re-connected to the task scheduler: > > arndale-octa, peach-pi, peach-pit, smdk5420 > > The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos > 5800). > > $ cat /sys/devices/system/cpu/cpu*/cpu_capacity > 1024 > 1024 > 1024 > 1024 > 389 > 389 > 389 > 389 I am missing something... shouldn't this be 539? Or is it scaled with the clock-frequency (1 GHz) value? Best regards, Krzysztof > > The Cortex-A15 vs Cortex-A7 performance ratio is 1024/389 = 2.63. 
> > The values derived with the 'cpu_efficiency/clock-frequency dt property' > solution are: > > $ cat /sys/devices/system/cpu/cpu*/cpu_capacity > 1535 > 1535 > 1535 > 1535 > 448 > 448 > 448 > 448 > > The Cortex-A15 vs Cortex-A7 performance ratio is 1535/448 = 3.43. > > The discrepancy between 2.63 and 3.43 is due to the false assumption > when using the 'cpu_efficiency/clock-frequency dt property' solution > that the max cpu frequency of the little cpus is 1 GHZ and not 1.3 GHz. > The Cortex-A7 cluster runs with a max cpu frequency of 1.3 GHZ whereas > the 'clock-frequency' property value is set to 1 GHz. > > 3.43/1.3 = 2.64 > > $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq > 1800000 > 1800000 > 1800000 > 1800000 > 1300000 <-- max cpu frequency of the Cortex-A7s (little cores) > 1300000 > 1300000 > 1300000 > > Running another benchmark (single-threaded sysbench affine to the > individual cpus) with performance cpufreq governor on the Samsung > Chromebook 2 13" showed the following numbers: > > $ for i in `seq 0 7`; do taskset -c $i sysbench --test=cpu > --num-threads=1 --max-time=10 run | grep "total number of events:"; > done > > total number of events: 1083 > total number of events: 1085 > total number of events: 1085 > total number of events: 1085 > total number of events: 454 > total number of events: 454 > total number of events: 454 > total number of events: 454 > > The Cortex-A15 vs Cortex-A7 performance ratio is 2.39, i.e. very close > to the one derived from the Dhrystone based one of the "Big.LITTLE > Processing with ARM Cortex™-A15 & Cortex-A7" white paper (2.63). > > We don't aim for exact values for the cpu capacity values. Besides the > CPI (Cycles Per Instruction), the instruction mix and whether the system > runs cpu-bound or memory-bound has an impact on the cpu capacity values > derived from these benchmark results. 
> > Cc: Rob Herring <robh+dt-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Cc: Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org> > Cc: Russell King <linux-I+IVW8TIWO2tmTQ+vhA3Yw@public.gmane.org> > Cc: Kukjin Kim <kgene-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Cc: Krzysztof Kozlowski <krzk-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > Signed-off-by: Dietmar Eggemann <dietmar.eggemann-5wv7dgnIgG8@public.gmane.org> > --- > arch/arm/boot/dts/exynos5420-cpus.dtsi | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/arch/arm/boot/dts/exynos5420-cpus.dtsi b/arch/arm/boot/dts/exynos5420-cpus.dtsi > index 5c052d7ff554..d7d703aa1699 100644 > --- a/arch/arm/boot/dts/exynos5420-cpus.dtsi > +++ b/arch/arm/boot/dts/exynos5420-cpus.dtsi > @@ -36,6 +36,7 @@ > cooling-min-level = <0>; > cooling-max-level = <11>; > #cooling-cells = <2>; /* min followed by max */ > + capacity-dmips-mhz = <1024>; > }; > > cpu1: cpu@1 { > @@ -48,6 +49,7 @@ > cooling-min-level = <0>; > cooling-max-level = <11>; > #cooling-cells = <2>; /* min followed by max */ > + capacity-dmips-mhz = <1024>; > }; > > cpu2: cpu@2 { > @@ -60,6 +62,7 @@ > cooling-min-level = <0>; > cooling-max-level = <11>; > #cooling-cells = <2>; /* min followed by max */ > + capacity-dmips-mhz = <1024>; > }; > > cpu3: cpu@3 { > @@ -72,6 +75,7 @@ > cooling-min-level = <0>; > cooling-max-level = <11>; > #cooling-cells = <2>; /* min followed by max */ > + capacity-dmips-mhz = <1024>; > }; > > cpu4: cpu@100 { > @@ -85,6 +89,7 @@ > cooling-min-level = <0>; > cooling-max-level = <7>; > #cooling-cells = <2>; /* min followed by max */ > + capacity-dmips-mhz = <539>; > }; > > cpu5: cpu@101 { > @@ -97,6 +102,7 @@ > cooling-min-level = <0>; > cooling-max-level = <7>; > #cooling-cells = <2>; /* min followed by max */ > + capacity-dmips-mhz = <539>; > }; > > cpu6: cpu@102 { > @@ -109,6 +115,7 @@ > cooling-min-level = <0>; > cooling-max-level = <7>; > #cooling-cells = <2>; /* min followed by max */ > + capacity-dmips-mhz = <539>; > }; 
> > cpu7: cpu@103 { > @@ -121,6 +128,7 @@ > cooling-min-level = <0>; > cooling-max-level = <7>; > #cooling-cells = <2>; /* min followed by max */ > + capacity-dmips-mhz = <539>; > }; > }; > }; > -- > 2.11.0 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  2017-08-30 20:26 ` Krzysztof Kozlowski
@ 2017-08-31 10:36 ` Dietmar Eggemann
  2017-09-03 19:56 ` Krzysztof Kozlowski
  0 siblings, 1 reply; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-31 10:36 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Vincent Guittot, Juri Lelli

On 30/08/17 21:26, Krzysztof Kozlowski wrote:
> On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote:
>> The following 'capacity-dmips-mhz' dt property values are used:
>>
>> Cortex-A15: 1024, Cortex-A7: 539
>>
>> They have been derived from the cpu_efficiency values:
>>
>> Cortex-A15: 3891, Cortex-A7: 2048
>>
>> by scaling them so that the Cortex-A15s (big cores) use 1024.
>>
>> The cpu_efficiency values were originally derived from the "Big.LITTLE
>> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
>> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
>> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
>> Dhrystone benchmark.
>>
>> The following platforms are affected once cpu-invariant accounting
>> support is re-connected to the task scheduler:
>>
>> arndale-octa, peach-pi, peach-pit, smdk5420
>>
>> The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos
>> 5800).
>>
>> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
>> 1024
>> 1024
>> 1024
>> 1024
>> 389
>> 389
>> 389
>> 389
>
> I am missing something... shouldn't this be 539? Or is it scaled with
> the clock-frequency (1 GHz) value?

Yeah, the capacity-dmips-mhz dt value of 539 for the little cpus is
scaled by 1.3/1.8 (max cpu frequency / system-wide max cpu frequency):

539 * 1.3/1.8 = 389

This max cpu frequency scaling is part of both solutions, the 'cpu
capacity-dmips-mhz' and the 'cpu_efficiency/clock-frequency dt property'
one.

The (original*) cpu capacity on a heterogeneous platform expresses uArch
and max cpu frequency differences between the (logical) cpus of the
system.

* not further reduced by rt and/or irq pressure.

[...]

^ permalink raw reply	[flat|nested] 17+ messages in thread
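Editorial note: the scaling described in the reply above (539 * 1.3/1.8 = 389) can be sketched in user space as follows. This is an illustration under stated assumptions, not the code in drivers/base/arch_topology.c: struct cpu_desc and normalize_cpu_capacity() are hypothetical names, and normalizing against the largest dmips * frequency product in the system is an assumption that matches the figures quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_CAPACITY_SCALE 1024ULL

/* Hypothetical per-cpu description used only for this sketch. */
struct cpu_desc {
	unsigned long dmips_mhz;	/* capacity-dmips-mhz dt property */
	unsigned long max_freq_khz;	/* e.g. cpufreq scaling_max_freq */
};

/*
 * Write the capacity of each cpu to out[], scaled so that the largest
 * dmips_mhz * max_freq_khz product in the system maps to 1024.
 */
void normalize_cpu_capacity(const struct cpu_desc *cpus, size_t n,
			    unsigned long *out)
{
	unsigned long long max_raw = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long long raw =
			(unsigned long long)cpus[i].dmips_mhz * cpus[i].max_freq_khz;

		if (raw > max_raw)
			max_raw = raw;
	}
	assert(max_raw > 0);

	for (i = 0; i < n; i++) {
		unsigned long long raw =
			(unsigned long long)cpus[i].dmips_mhz * cpus[i].max_freq_khz;

		out[i] = (unsigned long)(raw * SCHED_CAPACITY_SCALE / max_raw);
	}
}
```

With one big cpu at { 1024, 1800000 } and one little cpu at { 539, 1300000 }, the sketch yields 1024 and 389, the values shown in the cpu_capacity output for peach-pi.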
* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information 2017-08-31 10:36 ` Dietmar Eggemann @ 2017-09-03 19:56 ` Krzysztof Kozlowski 2017-09-06 11:47 ` Dietmar Eggemann 0 siblings, 1 reply; 17+ messages in thread From: Krzysztof Kozlowski @ 2017-09-03 19:56 UTC (permalink / raw) To: Dietmar Eggemann Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc, linux-renesas-soc, Russell King, Rob Herring, Mark Rutland, Kukjin Kim, Vincent Guittot, Juri Lelli On Thu, Aug 31, 2017 at 11:36:07AM +0100, Dietmar Eggemann wrote: > On 30/08/17 21:26, Krzysztof Kozlowski wrote: > > On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote: > >> The following 'capacity-dmips-mhz' dt property values are used: > >> > >> Cortex-A15: 1024, Cortex-A7: 539 > >> > >> They have been derived from the cpu_efficiency values: > >> > >> Cortex-A15: 3891, Cortex-A7: 2048 > >> > >> by scaling them so that the Cortex-A15s (big cores) use 1024. > >> > >> The cpu_efficiency values were originally derived from the "Big.LITTLE > >> Processing with ARM Cortex™-A15 & Cortex-A7" white paper > >> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x > >> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the > >> Dhrystone benchmark. > >> > >> The following platforms are affected once cpu-invariant accounting > >> support is re-connected to the task scheduler: > >> > >> arndale-octa, peach-pi, peach-pit, smdk5420 > >> > >> The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos > >> 5800). > >> > >> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity > >> 1024 > >> 1024 > >> 1024 > >> 1024 > >> 389 > >> 389 > >> 389 > >> 389 > > > > I am missing something... shouldn't this be 539? Or is it scaled with > > the clock-frequency (1 GHz) value? 
> > Yeah, the capacity-dmips-mhz dt value of 539 for the little cpus is > scaled by 1.3/1.8 (max cpu capacity/ system wide max cpu capacity): > > 539 * 1.3/1.8 = 389 > > This max cpu capacity scaling is part of both solutions, the 'cpu > capacity-dmips-mhz' and the 'cpu_efficiency/clock-frequency dt property' > one. > > The (original*) cpu capacity on a heterogeneous platform expresses uArch > and max cpu frequency differences between the (logical) cpus of the > system. > > * not further reduced by rt and/or irq pressure. > > [...] Thanks for explanation, looks fine for me. I'll take it after merge window. Best regards, Krzysztof ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information 2017-09-03 19:56 ` Krzysztof Kozlowski @ 2017-09-06 11:47 ` Dietmar Eggemann 0 siblings, 0 replies; 17+ messages in thread From: Dietmar Eggemann @ 2017-09-06 11:47 UTC (permalink / raw) To: Krzysztof Kozlowski Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA, linux-renesas-soc-u79uwXL29TY76Z2rM5mHXA, Russell King, Rob Herring, Mark Rutland, Kukjin Kim, Vincent Guittot, Juri Lelli On 03/09/17 20:56, Krzysztof Kozlowski wrote: > On Thu, Aug 31, 2017 at 11:36:07AM +0100, Dietmar Eggemann wrote: >> On 30/08/17 21:26, Krzysztof Kozlowski wrote: >>> On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote: [...] >>>> The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos >>>> 5800). >>>> >>>> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity >>>> 1024 >>>> 1024 >>>> 1024 >>>> 1024 >>>> 389 >>>> 389 >>>> 389 >>>> 389 >>> >>> I am missing something... shouldn't this be 539? Or is it scaled with >>> the clock-frequency (1 GHz) value? >> >> Yeah, the capacity-dmips-mhz dt value of 539 for the little cpus is >> scaled by 1.3/1.8 (max cpu capacity/ system wide max cpu capacity): >> >> 539 * 1.3/1.8 = 389 >> >> This max cpu capacity scaling is part of both solutions, the 'cpu >> capacity-dmips-mhz' and the 'cpu_efficiency/clock-frequency dt property' >> one. >> >> The (original*) cpu capacity on a heterogeneous platform expresses uArch >> and max cpu frequency differences between the (logical) cpus of the >> system. >> >> * not further reduced by rt and/or irq pressure. >> >> [...] > > Thanks for explanation, looks fine for me. I'll take it after merge > window. 
Nice, since the 'cpu capacity-dmips-mhz' is already supported for arm (and used by TC2 (vexpress-v2p-ca15_a7.dts)) this can be done independently of the actual removal of the 'cpu_efficiency/clock-frequency dt property' solution in patch 1/4. [..] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information 2017-08-30 14:41 ` [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information Dietmar Eggemann [not found] ` <20170830144120.9312-3-dietmar.eggemann-5wv7dgnIgG8@public.gmane.org> @ 2017-09-17 7:37 ` Krzysztof Kozlowski 1 sibling, 0 replies; 17+ messages in thread From: Krzysztof Kozlowski @ 2017-09-17 7:37 UTC (permalink / raw) To: Dietmar Eggemann Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc, linux-renesas-soc, Russell King, Rob Herring, Mark Rutland, Kukjin Kim, Vincent Guittot, Juri Lelli On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote: > The following 'capacity-dmips-mhz' dt property values are used: > > Cortex-A15: 1024, Cortex-A7: 539 > > They have been derived from the cpu_efficiency values: > > Cortex-A15: 3891, Cortex-A7: 2048 > > by scaling them so that the Cortex-A15s (big cores) use 1024. > > The cpu_efficiency values were originally derived from the "Big.LITTLE > Processing with ARM Cortex™-A15 & Cortex-A7" white paper > (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x > (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the > Dhrystone benchmark. > > The following platforms are affected once cpu-invariant accounting > support is re-connected to the task scheduler: > > arndale-octa, peach-pi, peach-pit, smdk5420 > > The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos > 5800). > > $ cat /sys/devices/system/cpu/cpu*/cpu_capacity > 1024 > 1024 > 1024 > 1024 > 389 > 389 > 389 > 389 > > The Cortex-A15 vs Cortex-A7 performance ratio is 1024/389 = 2.63. > > The values derived with the 'cpu_efficiency/clock-frequency dt property' > solution are: > > $ cat /sys/devices/system/cpu/cpu*/cpu_capacity > 1535 > 1535 > 1535 > 1535 > 448 > 448 > 448 > 448 > > The Cortex-A15 vs Cortex-A7 performance ratio is 1535/448 = 3.43. 
> > The discrepancy between 2.63 and 3.43 is due to the false assumption > when using the 'cpu_efficiency/clock-frequency dt property' solution > that the max cpu frequency of the little cpus is 1 GHZ and not 1.3 GHz. > The Cortex-A7 cluster runs with a max cpu frequency of 1.3 GHZ whereas > the 'clock-frequency' property value is set to 1 GHz. > > 3.43/1.3 = 2.64 > > $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq > 1800000 > 1800000 > 1800000 > 1800000 > 1300000 <-- max cpu frequency of the Cortex-A7s (little cores) > 1300000 > 1300000 > 1300000 > > Running another benchmark (single-threaded sysbench affine to the > individual cpus) with performance cpufreq governor on the Samsung > Chromebook 2 13" showed the following numbers: > > $ for i in `seq 0 7`; do taskset -c $i sysbench --test=cpu > --num-threads=1 --max-time=10 run | grep "total number of events:"; > done > > total number of events: 1083 > total number of events: 1085 > total number of events: 1085 > total number of events: 1085 > total number of events: 454 > total number of events: 454 > total number of events: 454 > total number of events: 454 > > The Cortex-A15 vs Cortex-A7 performance ratio is 2.39, i.e. very close > to the one derived from the Dhrystone based one of the "Big.LITTLE > Processing with ARM Cortex™-A15 & Cortex-A7" white paper (2.63). > > We don't aim for exact values for the cpu capacity values. Besides the > CPI (Cycles Per Instruction), the instruction mix and whether the system > runs cpu-bound or memory-bound has an impact on the cpu capacity values > derived from these benchmark results. 
> > Cc: Rob Herring <robh+dt@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Russell King <linux@armlinux.org.uk> > Cc: Kukjin Kim <kgene@kernel.org> > Cc: Krzysztof Kozlowski <krzk@kernel.org> > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> > --- > arch/arm/boot/dts/exynos5420-cpus.dtsi | 8 ++++++++ > 1 file changed, 8 insertions(+) > Thanks, applied (with s/arm/ARM/ change in subject). Best regards, Krzysztof ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <20170830144120.9312-1-dietmar.eggemann-5wv7dgnIgG8@public.gmane.org>]
* [PATCH 3/4] arm: dts: exynos: add exynos5422 cpu capacity-dmips-mhz information
       [not found] ` <20170830144120.9312-1-dietmar.eggemann-5wv7dgnIgG8@public.gmane.org>
@ 2017-08-30 14:41   ` Dietmar Eggemann
  2017-09-17  7:37     ` Krzysztof Kozlowski
  0 siblings, 1 reply; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC (permalink / raw)
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA,
	linux-renesas-soc-u79uwXL29TY76Z2rM5mHXA
  Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
	Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

The following 'capacity-dmips-mhz' dt property values are used:

Cortex-A15: 1024, Cortex-A7: 539

They have been derived from the cpu_efficiency values:

Cortex-A15: 3891, Cortex-A7: 2048

by scaling them so that the Cortex-A15s (big cores) use 1024.

The cpu_efficiency values were originally derived from the "Big.LITTLE
Processing with ARM Cortex™-A15 & Cortex-A7" white paper
(http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
(3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
Dhrystone benchmark.

The following platforms are affected once cpu-invariant accounting
support is re-connected to the task scheduler:

odroidxu3, odroidxu3-lite, odroidxu4

Cc: Rob Herring <robh+dt-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
Cc: Russell King <linux-I+IVW8TIWO2tmTQ+vhA3Yw@public.gmane.org>
Cc: Kukjin Kim <kgene-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Krzysztof Kozlowski <krzk-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann-5wv7dgnIgG8@public.gmane.org>
---
 arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5422-cpus.dtsi b/arch/arm/boot/dts/exynos5422-cpus.dtsi
index bf3c6f1ec4ee..ec01d8020c2d 100644
--- a/arch/arm/boot/dts/exynos5422-cpus.dtsi
+++ b/arch/arm/boot/dts/exynos5422-cpus.dtsi
@@ -35,6 +35,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu1: cpu@101 {
@@ -47,6 +48,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu2: cpu@102 {
@@ -59,6 +61,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu3: cpu@103 {
@@ -71,6 +74,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu4: cpu@0 {
@@ -84,6 +88,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <15>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu5: cpu@1 {
@@ -96,6 +101,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <15>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu6: cpu@2 {
@@ -108,6 +114,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <15>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu7: cpu@3 {
@@ -120,6 +127,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <15>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 	};
 };
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread
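Editorial note: the step from the cpu_efficiency values (3891, 2048) to the 'capacity-dmips-mhz' values (1024, 539) used in these patches can be reproduced with the sketch below. The helper name efficiency_to_dmips_mhz() is hypothetical, and rounding to the nearest integer is an assumption: the thread does not say how the value was rounded, and plain truncation would give 538 rather than 539.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL

/*
 * Scale an efficiency value so that the most efficient core type maps to
 * 1024, rounding to the nearest integer (assumed rounding mode).
 */
unsigned long efficiency_to_dmips_mhz(unsigned long efficiency,
				      unsigned long max_efficiency)
{
	assert(max_efficiency > 0 && efficiency <= max_efficiency);

	return (efficiency * SCHED_CAPACITY_SCALE + max_efficiency / 2)
		/ max_efficiency;
}
```

Calling efficiency_to_dmips_mhz(3891, 3891) gives 1024 for the Cortex-A15 and efficiency_to_dmips_mhz(2048, 3891) gives 539 for the Cortex-A7, so the 1.9x ratio of the original table is preserved.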
* Re: [PATCH 3/4] arm: dts: exynos: add exynos5422 cpu capacity-dmips-mhz information
  2017-08-30 14:41   ` [PATCH 3/4] arm: dts: exynos: add exynos5422 " Dietmar Eggemann
@ 2017-09-17  7:37     ` Krzysztof Kozlowski
  0 siblings, 0 replies; 17+ messages in thread
From: Krzysztof Kozlowski @ 2017-09-17 7:37 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Vincent Guittot, Juri Lelli

On Wed, Aug 30, 2017 at 03:41:19PM +0100, Dietmar Eggemann wrote:
> The following 'capacity-dmips-mhz' dt property values are used:
>
> Cortex-A15: 1024, Cortex-A7: 539
>
> They have been derived from the cpu_efficiency values:
>
> Cortex-A15: 3891, Cortex-A7: 2048
>
> by scaling them so that the Cortex-A15s (big cores) use 1024.
>
> The cpu_efficiency values were originally derived from the "Big.LITTLE
> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
> Dhrystone benchmark.
>
> The following platforms are affected once cpu-invariant accounting
> support is re-connected to the task scheduler:
>
> odroidxu3, odroidxu3-lite, odroidxu4
>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Kukjin Kim <kgene@kernel.org>
> Cc: Krzysztof Kozlowski <krzk@kernel.org>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
>  1 file changed, 8 insertions(+)
>

Thanks, applied.

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 17+ messages in thread
* [PATCH 4/4] arm: dts: r8a7790: add cpu capacity-dmips-mhz information
  2017-08-30 14:41 [PATCH 0/4] arm: remove cpu_efficiency Dietmar Eggemann
                   ` (2 preceding siblings ...)
       [not found] ` <20170830144120.9312-1-dietmar.eggemann-5wv7dgnIgG8@public.gmane.org>
@ 2017-08-30 14:41 ` Dietmar Eggemann
  2017-09-18  7:39   ` Simon Horman
  3 siblings, 1 reply; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc
  Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
	Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

The following 'capacity-dmips-mhz' dt property values are used:

Cortex-A15: 1024, Cortex-A7: 539

They have been derived from the cpu_efficiency values:

Cortex-A15: 3891, Cortex-A7: 2048

by scaling them so that the Cortex-A15s (big cores) use 1024.

The cpu_efficiency values were originally derived from the "Big.LITTLE
Processing with ARM Cortex™-A15 & Cortex-A7" white paper
(http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
(3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
Dhrystone benchmark.

The following platform is affected once cpu-invariant accounting
support is re-connected to the task scheduler:

r8a7790-lager

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/boot/dts/r8a7790.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/r8a7790.dtsi b/arch/arm/boot/dts/r8a7790.dtsi
index 2805a8608d4b..a57c0e170d8b 100644
--- a/arch/arm/boot/dts/r8a7790.dtsi
+++ b/arch/arm/boot/dts/r8a7790.dtsi
@@ -56,6 +56,7 @@
 			clock-latency = <300000>; /* 300 us */
 			power-domains = <&sysc R8A7790_PD_CA15_CPU0>;
 			next-level-cache = <&L2_CA15>;
+			capacity-dmips-mhz = <1024>;
 
 			/* kHz - uV - OPPs unknown yet */
 			operating-points = <1400000 1000000>,
@@ -73,6 +74,7 @@
 			clock-frequency = <1300000000>;
 			power-domains = <&sysc R8A7790_PD_CA15_CPU1>;
 			next-level-cache = <&L2_CA15>;
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu2: cpu@2 {
@@ -82,6 +84,7 @@
 			clock-frequency = <1300000000>;
 			power-domains = <&sysc R8A7790_PD_CA15_CPU2>;
 			next-level-cache = <&L2_CA15>;
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu3: cpu@3 {
@@ -91,6 +94,7 @@
 			clock-frequency = <1300000000>;
 			power-domains = <&sysc R8A7790_PD_CA15_CPU3>;
 			next-level-cache = <&L2_CA15>;
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu4: cpu@100 {
@@ -100,6 +104,7 @@
 			clock-frequency = <780000000>;
 			power-domains = <&sysc R8A7790_PD_CA7_CPU0>;
 			next-level-cache = <&L2_CA7>;
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu5: cpu@101 {
@@ -109,6 +114,7 @@
 			clock-frequency = <780000000>;
 			power-domains = <&sysc R8A7790_PD_CA7_CPU1>;
 			next-level-cache = <&L2_CA7>;
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu6: cpu@102 {
@@ -118,6 +124,7 @@
 			clock-frequency = <780000000>;
 			power-domains = <&sysc R8A7790_PD_CA7_CPU2>;
 			next-level-cache = <&L2_CA7>;
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu7: cpu@103 {
@@ -127,6 +134,7 @@
 			clock-frequency = <780000000>;
 			power-domains = <&sysc R8A7790_PD_CA7_CPU3>;
 			next-level-cache = <&L2_CA7>;
+			capacity-dmips-mhz = <539>;
 		};
 
 		L2_CA15: cache-controller-0 {
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] arm: dts: r8a7790: add cpu capacity-dmips-mhz information
  2017-08-30 14:41 ` [PATCH 4/4] arm: dts: r8a7790: add " Dietmar Eggemann
@ 2017-09-18  7:39   ` Simon Horman
  2017-10-09 17:55     ` Dietmar Eggemann
  0 siblings, 1 reply; 17+ messages in thread
From: Simon Horman @ 2017-09-18 7:39 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

On Wed, Aug 30, 2017 at 03:41:20PM +0100, Dietmar Eggemann wrote:
> The following 'capacity-dmips-mhz' dt property values are used:
>
> Cortex-A15: 1024, Cortex-A7: 539
>
> They have been derived from the cpu_efficiency values:
>
> Cortex-A15: 3891, Cortex-A7: 2048
>
> by scaling them so that the Cortex-A15s (big cores) use 1024.
>
> The cpu_efficiency values were originally derived from the "Big.LITTLE
> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
> Dhrystone benchmark.
>
> The following platform is affected once cpu-invariant accounting
> support is re-connected to the task scheduler:

Thanks, applied for v4.15.

My understanding from the following comment in the cover letter is that
this is not currently the case and thus there is no behavioural change
in applying this patch.

For the record I observed the following with and without this patch
applied. I believe this is the expected result.

v4.14-rc1
# cat /sys/devices/system/cpu/cpu*/cpu_capacity
1535
1535
1535
1535
1024
1024
1024
1024

v4.14-rc1 + patch
# cat /sys/devices/system/cpu/cpu*/cpu_capacity
1024
1024
1024
1024
539
539
539
539

^ permalink raw reply	[flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] arm: dts: r8a7790: add cpu capacity-dmips-mhz information
  2017-09-18  7:39   ` Simon Horman
@ 2017-10-09 17:55     ` Dietmar Eggemann
  0 siblings, 0 replies; 17+ messages in thread
From: Dietmar Eggemann @ 2017-10-09 17:55 UTC (permalink / raw)
  To: Simon Horman
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

On 18/09/17 08:39, Simon Horman wrote:
> On Wed, Aug 30, 2017 at 03:41:20PM +0100, Dietmar Eggemann wrote:
>> The following 'capacity-dmips-mhz' dt property values are used:
>>
>> Cortex-A15: 1024, Cortex-A7: 539
>>
>> They have been derived from the cpu_efficiency values:
>>
>> Cortex-A15: 3891, Cortex-A7: 2048
>>
>> by scaling them so that the Cortex-A15s (big cores) use 1024.
>>
>> The cpu_efficiency values were originally derived from the "Big.LITTLE
>> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
>> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
>> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
>> Dhrystone benchmark.
>>
>> The following platform is affected once cpu-invariant accounting
>> support is re-connected to the task scheduler:
>
> Thanks, applied for v4.15.
>
> My understanding from the following comment in the cover letter is that
> this is not currently the case and thus there is no behavioural change
> in applying this patch.
>
> For the record I observed the following with and without this patch
> applied. I believe this is the expected result.
>
> v4.14-rc1
> # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1535
> 1535
> 1535
> 1535
> 1024
> 1024
> 1024
> 1024
>
> v4.14-rc1 + patch
> # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1024
> 1024
> 1024
> 1024
> 539
> 539
> 539
> 539

Thanks Simon! Yes, that is the expected behaviour. And sorry for not
responding earlier!

With exynos542{0,2} and r8a7790 switching to the 'capacity-dmips-mhz'
based solution in v4.15, I can push for removal of the cpu_efficiency
code [patch 1/4].

^ permalink raw reply	[flat|nested] 17+ messages in thread