From mboxrd@z Thu Jan 1 00:00:00 1970 From: Daniel Lezcano Subject: Re: [PATCH 10/10] cpuidle: cpuidle driver for MIPS CPS Date: Wed, 26 Feb 2014 15:37:39 +0100 Message-ID: <530DFC33.1090609@linaro.org> References: <1389794137-11361-1-git-send-email-paul.burton@imgtec.com> <1389794137-11361-11-git-send-email-paul.burton@imgtec.com> <530CB7DA.7060305@linaro.org> <20140225221200.GP25765@pburton-linux.le.imgtec.org> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Return-path: Received: from mail-wi0-f179.google.com ([209.85.212.179]:45141 "EHLO mail-wi0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750891AbaBZOhj (ORCPT ); Wed, 26 Feb 2014 09:37:39 -0500 Received: by mail-wi0-f179.google.com with SMTP id bs8so2206231wib.12 for ; Wed, 26 Feb 2014 06:37:37 -0800 (PST) In-Reply-To: <20140225221200.GP25765@pburton-linux.le.imgtec.org> Sender: linux-pm-owner@vger.kernel.org List-Id: linux-pm@vger.kernel.org To: Paul Burton Cc: linux-mips@linux-mips.org, "Rafael J. Wysocki" , linux-pm@vger.kernel.org On 02/25/2014 11:12 PM, Paul Burton wrote: > On Tue, Feb 25, 2014 at 04:33:46PM +0100, Daniel Lezcano wrote: >> On 01/15/2014 02:55 PM, Paul Burton wrote: >>> This patch introduces a cpuidle driver implementation for the MIPS >>> Coherent Processing System (ie. Coherence Manager, Cluster Power >>> Controller). It allows for use of the following idle states: >>> >>> - Coherent wait. This is the usual MIPS wait instruction. >>> >>> - Non-coherent wait. In this state a core will disable coherency = with >>> the rest of the system before running the wait instruction. Thi= s >>> eliminates coherence interventions which would typically be use= d to >>> keep cores coherent. >>> >>> These two states lay the groundwork for deeper states to be impleme= nted >>> later, since all deeper states require the core to become non-coher= ent. >>> >>> Signed-off-by: Paul Burton >>> Cc: Rafael J. Wysocki >>> Cc: Daniel Lezcano >>> Cc: linux-pm@vger.kernel.org >>> --- >>> drivers/cpuidle/Kconfig | 5 + >>> drivers/cpuidle/Kconfig.mips | 14 + >>> drivers/cpuidle/Makefile | 3 + >>> drivers/cpuidle/cpuidle-mips-cps.c | 545 ++++++++++++++++++++++++= +++++++++++++ >>> 4 files changed, 567 insertions(+) >>> create mode 100644 drivers/cpuidle/Kconfig.mips >>> create mode 100644 drivers/cpuidle/cpuidle-mips-cps.c >>> >>> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig >>> index b3fb81d..11ff281 100644 >>> --- a/drivers/cpuidle/Kconfig >>> +++ b/drivers/cpuidle/Kconfig >>> @@ -35,6 +35,11 @@ depends on ARM >>> source "drivers/cpuidle/Kconfig.arm" >>> endmenu >>> >>> +menu "MIPS CPU Idle Drivers" >>> +depends on MIPS >>> +source "drivers/cpuidle/Kconfig.mips" >>> +endmenu >>> + >>> endif >>> >>> config ARCH_NEEDS_CPU_IDLE_COUPLED >>> diff --git a/drivers/cpuidle/Kconfig.mips b/drivers/cpuidle/Kconfig= =2Emips >>> new file mode 100644 >>> index 0000000..dc96691 >>> --- /dev/null >>> +++ b/drivers/cpuidle/Kconfig.mips >>> @@ -0,0 +1,14 @@ >>> +# >>> +# MIPS CPU Idle drivers >>> +# >>> + >>> +config MIPS_CPS_CPUIDLE >>> + bool "Support for MIPS Coherent Processing Systems" >>> + depends on SYS_SUPPORTS_MIPS_CPS && CPU_MIPSR2 && !MIPS_MT_SMTC >>> + select ARCH_NEEDS_CPU_IDLE_COUPLED if MIPS_MT >>> + select MIPS_CM >>> + help >>> + Select this option to enable CPU idle driver for systems based >>> + around the MIPS Coherent Processing System architecture - that >>> + is, those with a Coherence Manager & optionally a Cluster >>> + Power Controller. >>> diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile >>> index 527be28..693cd95 100644 >>> --- a/drivers/cpuidle/Makefile >>> +++ b/drivers/cpuidle/Makefile >>> @@ -13,3 +13,6 @@ obj-$(CONFIG_ARM_KIRKWOOD_CPUIDLE) +=3D cpuidle-k= irkwood.o >>> obj-$(CONFIG_ARM_ZYNQ_CPUIDLE) +=3D cpuidle-zynq.o >>> obj-$(CONFIG_ARM_U8500_CPUIDLE) +=3D cpuidle-ux500.o >>> obj-$(CONFIG_ARM_AT91_CPUIDLE) +=3D cpuidle-at91.o >>> + >>> +# MIPS SoC drivers >>> +obj-$(CONFIG_MIPS_CPS_CPUIDLE) +=3D cpuidle-mips-cps.o >>> diff --git a/drivers/cpuidle/cpuidle-mips-cps.c b/drivers/cpuidle/c= puidle-mips-cps.c >>> new file mode 100644 >>> index 0000000..a78bfb4 >>> --- /dev/null >>> +++ b/drivers/cpuidle/cpuidle-mips-cps.c >>> @@ -0,0 +1,545 @@ >>> +/* >>> + * Copyright (C) 2013 Imagination Technologies >>> + * Author: Paul Burton >>> + * >>> + * This program is free software; you can redistribute it and/or m= odify it >>> + * under the terms of the GNU General Public License as published = by the >>> + * Free Software Foundation; either version 2 of the License, or= (at your >>> + * option) any later version. >>> + */ >>> + >>> +#include >>> +#include >>> +#include >>> +#include >>> + >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >> >> The convention is to not include headers from arch. These headers sh= ouldn't >> appear in this driver. >> > > Without accessing those headers I can't really implement anything use= ful > - entering these idle states by their nature involves architecture > specifics. Would you rather the bulk of the driver is implemented und= er > arch/mips & the code in drivers/cpuidle simply calls elsewhere to do = the > real work? Usually, this is how it is done. If you can move the 'asm' code inside=20 eg. pm.c somewhere in arch/mips and invoke from pm functions from=20 cpuidle through a simple header that would be great. >>> +/* >>> + * The CM & CPC can only handle coherence & power control on a per= -core basis, >>> + * thus in an MT system the VPEs within each core are coupled and = can only >>> + * enter or exit states requiring CM or CPC assistance in unison. >>> + */ >>> +#ifdef CONFIG_MIPS_MT >>> +# define coupled_coherence cpu_has_mipsmt >>> +#else >>> +# define coupled_coherence 0 >>> +#endif >>> + >>> +/* >>> + * cps_nc_entry_fn - type of a generated non-coherent state entry = function >>> + * @vpe_mask: a bitmap of online coupled VPEs, excluding this one >>> + * @online: the count of online coupled VPEs (weight of vpe_mask += 1) >>> + * >>> + * The code entering & exiting non-coherent states is generated at= runtime >>> + * using uasm, in order to ensure that the compiler cannot insert = a stray >>> + * memory access at an unfortunate time and to allow the generatio= n of optimal >>> + * core-specific code particularly for cache routines. If coupled_= coherence >>> + * is non-zero, returns the number of VPEs that were in the wait s= tate at the >>> + * point this VPE left it. Returns garbage if coupled_coherence is= zero. >>> + */ >>> +typedef unsigned (*cps_nc_entry_fn)(unsigned vpe_mask, unsigned on= line); >>> + >>> +/* >>> + * The entry point of the generated non-coherent wait entry/exit f= unction. >>> + * Actually per-core rather than per-CPU. >>> + */ >>> +static DEFINE_PER_CPU_READ_MOSTLY(cps_nc_entry_fn, ncwait_asm_ente= r); >>> + >>> +/* >>> + * Indicates the number of coupled VPEs ready to operate in a non-= coherent >>> + * state. Actually per-core rather than per-CPU. >>> + */ >>> +static DEFINE_PER_CPU_ALIGNED(u32, nc_ready_count); >>> + >>> +/* A somewhat arbitrary number of labels & relocs for uasm */ >>> +static struct uasm_label labels[32] __initdata; >>> +static struct uasm_reloc relocs[32] __initdata; >>> + >>> +/* CPU dependant sync types */ >>> +static unsigned stype_intervention; >>> +static unsigned stype_memory; >>> +static unsigned stype_ordering; >>> + >>> +enum mips_reg { >>> + zero, at, v0, v1, a0, a1, a2, a3, >>> + t0, t1, t2, t3, t4, t5, t6, t7, >>> + s0, s1, s2, s3, s4, s5, s6, s7, >>> + t8, t9, k0, k1, gp, sp, fp, ra, >>> +}; >>> + >>> +static int cps_ncwait_enter(struct cpuidle_device *dev, >>> + struct cpuidle_driver *drv, int index) >>> +{ >>> + unsigned core =3D cpu_data[dev->cpu].core; >>> + unsigned online, first_cpu, num_left; >>> + cpumask_var_t coupled_mask, vpe_mask; >>> + >>> + if (!alloc_cpumask_var(&coupled_mask, GFP_KERNEL)) >>> + return -ENOMEM; >>> + >>> + if (!alloc_cpumask_var(&vpe_mask, GFP_KERNEL)) { >>> + free_cpumask_var(coupled_mask); >>> + return -ENOMEM; >>> + } >> >> You can't do that in this function where the local irqs are disabled= =2E IMO, >> if you set CONFIG_CPUMASK_OFFSTACK and CONFIG_DEBUG_ATOMIC_SLEEP, yo= u should >> see a kernel warning. > > Right you are, it should either use GFP_ATOMIC or just not handle the > off-stack case (which isn't currently used). Why don't you just allocate these cpumasks at init time instead of=20 allocating / freeing them in the fast path ? >>> + /* Calculate which coupled CPUs (VPEs) are online */ >>> +#ifdef CONFIG_MIPS_MT >>> + cpumask_and(coupled_mask, cpu_online_mask, &dev->coupled_cpus); >>> + first_cpu =3D cpumask_first(coupled_mask); >>> + online =3D cpumask_weight(coupled_mask); >>> + cpumask_clear_cpu(dev->cpu, coupled_mask); >>> + cpumask_shift_right(vpe_mask, coupled_mask, >>> + cpumask_first(&dev->coupled_cpus)); >> >> What is the purpose of this computation ? > > If you read through the code generated in cps_gen_entry_code, the > vpe_mask is used to indicate the VPEs within the current core which a= re > both online & not the VPE currently running the code. Ok, thanks for the clarification. Shouldn't 'online =3D cpumask_weight(coupled_mask)' be after clearing t= he=20 current cpu ? Otherwise, online has also the current cpu which, IIUC,=20 shouldn't be included, no ? >>> +#else >>> + cpumask_clear(coupled_mask); >>> + cpumask_clear(vpe_mask); >>> + first_cpu =3D dev->cpu; >>> + online =3D 1; >>> +#endif >> >> first_cpu is not used. > > Right you are. It's used in further work-in-progress patches but I'll > remove it from this one. > >> >>> + /* >>> + * Run the generated entry code. Note that we assume the number o= f VPEs >>> + * within this core does not exceed the width in bits of a long. = Since >>> + * MVPConf0.PVPE is 4 bits wide this seems like a safe assumption= =2E >>> + */ >>> + num_left =3D per_cpu(ncwait_asm_enter, core)(vpe_mask->bits[0], o= nline); >>> + >>> + /* >>> + * If this VPE is the first to leave the non-coherent wait state = then >>> + * it needs to wake up any coupled VPEs still running their wait >>> + * instruction so that they return to cpuidle, >> >> Why is it needed ? Can't the other cpus stay idle ? > > Not with the current cpuidle code. Please see the end of > cpuidle_enter_state_coupled in drivers/cpuidle/coupled.c. The code wa= its > for all coupled CPUs to exit the idle state before any of them procee= d. > Whilst I suppose it would be possible to modify cpuidle to not requir= e > that, it would leave you with inaccurate residence statistics mention= ed > below. > >> >>> * which can then complete >>> + * coordination between the coupled VPEs & provide the governor w= ith >>> + * a chance to reflect on the length of time the VPEs were in the >>> + * idle state. >>> + */ >>> + if (coupled_coherence && (num_left =3D=3D online)) >>> + arch_send_call_function_ipi_mask(coupled_mask); >> >> Except there is no choice due to hardware limitations, I don't think= this is >> valid. > > By nature when one CPU (VPE) within a core leaves a non-coherent stat= e > the rest do too, because as I mentioned coherence is a property of th= e > core not of individual VPEs. It would be possible to leave the other > VPEs idle if we didn't differentiate between the coherent & non-coher= ent > wait states, but again not with cpuidle as it is today (due to the > waiting for all CPUs at the end of cpuidle_enter_state_coupled). [ ... ] >>> + >>> + /* >>> + * Set the coupled flag on the appropriate states if this system >>> + * requires it. >>> + */ >>> + if (coupled_coherence) >>> + for (i =3D 1; i < cps_driver.state_count; i++) >>> + cps_driver.states[i].flags |=3D CPUIDLE_FLAG_COUPLED; >> >> I am not sure CPUIDLE_FLAG_COUPLED is the solution for this driver. = IIUC, >> with the IPI above and the wakeup sync with the couple states, this = driver >> is waking up everybody instead of sleeping as much as possible. >> >> I would recommend to have a look at a different approach, like the o= ne used >> for cpuidle-ux500.c. > > Ok it looks like that just counts & performs work only on the last > online CPU to run the code. Yes, the last-man-standing approach. Usually, the couple idle states ar= e=20 used when the processors need to do some extra work or only the cpu0 ca= n=20 invoke PM routine for security reason. In your case, IIUC, we don't care who will call the PM routine and if a= =20 cpu wakes up, the other ones (read the VPEs belonging to the same core)= ,=20 can stay idle until an interrupt occurs. > As before that could work but only by > disregarding any differentiation between coherent & non-coherent wait > states at levels above the driver. May be you can use the arch private flags for the idle states to=20 differentiate coherent or non-coherent wait states ? Or alternatively=20 create two drivers and register the right one. >> -- >> Linaro.org =E2=94=82 Open source software= for ARM SoCs >> >> Follow Linaro: Facebook | >> Twitter | >> Blog >> --=20 Linaro.org =E2=94=82 Open source software fo= r ARM SoCs =46ollow Linaro: Facebook | Twitter | Blog