From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Lezcano
Subject: Re: [PATCH] cpuidle: Fix the CPU stuck at C0 for 2-3s after PM_QOS back to DEFAULT
Date: Thu, 14 Aug 2014 15:29:49 +0200
Message-ID: <53ECB9CD.9040705@linaro.org>
References: <1407982309-4863-1-git-send-email-chuansheng.liu@intel.com>
 <20140814110040.GI16043@twins.programming.kicks-ass.net>
 <53EC9A29.8090408@linaro.org>
 <20140814124135.GJ16043@twins.programming.kicks-ass.net>
In-Reply-To: <20140814124135.GJ16043@twins.programming.kicks-ass.net>
Sender: linux-pm-owner@vger.kernel.org
List-Id: linux-pm@vger.kernel.org
To: Peter Zijlstra
Cc: Chuansheng Liu, "Rafael J. Wysocki", "linux-pm@vger.kernel.org", LKML,
 changcheng.liu@intel.com, xiaoming.wang@intel.com,
 souvik.k.chakravarty@intel.com

On 08/14/2014 02:41 PM, Peter Zijlstra wrote:
> On Thu, Aug 14, 2014 at 01:14:49PM +0200, Daniel Lezcano wrote:
>> On 08/14/2014 01:00 PM, Peter Zijlstra wrote:
>>> On Thu, Aug 14, 2014 at 12:29:32PM +0200, Daniel Lezcano wrote:
>>>> Hi Chuansheng,
>>>>
>>>> On 14 August 2014 04:11, Chuansheng Liu wrote:
>>>>
>>>>> We found sometimes even after we let PM_QOS back to DEFAULT,
>>>>> the CPU still stuck at C0 for 2-3s, don't do the new suitable C-state
>>>>> selection immediately after received the IPI interrupt.
>>>>>
>>>>> The code model is simply like below:
>>>>> {
>>>>>         pm_qos_update_request(&pm_qos, C1 - 1);
>>>>>         < == Here keep all cores at C0
>>>>>         ...;
>>>>>         pm_qos_update_request(&pm_qos, PM_QOS_DEFAULT_VALUE);
>>>>>         < == Here some cores still stuck at C0 for 2-3s
>>>>> }
>>>>>
>>>>> The reason is when pm_qos come back to DEFAULT, there is IPI interrupt to
>>>>> wake up the core, but when core is in poll idle state, the IPI interrupt
>>>>> can not break the polling loop.
>>>
>>> So seeing how you're from @intel.com I'm assuming you're using x86 here.
>>>
>>> I'm not seeing how this can be possible, MWAIT is interrupted by IPIs
>>> just fine, which means we'll fall out of the cpuidle_enter(), which
>>> means we'll cpuidle_reflect(), and then leave cpuidle_idle_call().
>>>
>>> It will indeed not leave the cpu_idle_loop() function and go right back
>>> into cpuidle_idle_call(), but that will then call cpuidle_select() which
>>> should pick a new C state.
>>>
>>> So the interrupt _should_ work. If it doesn't you need to explain why.
>>
>> I think the issue is related to the poll_idle state, in
>> drivers/cpuidle/driver.c. This state is x86 specific and inserted in the
>> cpuidle table as the state 0 (POLL). There is no mwait for this state. It is
>> a bit confusing because this state is not listed in the acpi / intel idle
>> driver but inserted implicitly at the beginning of the idle table by the
>> cpuidle framework when the driver is registered.
>>
>> static int poll_idle(struct cpuidle_device *dev,
>>                      struct cpuidle_driver *drv, int index)
>> {
>>         local_irq_enable();
>>         if (!current_set_polling_and_test()) {
>>                 while (!need_resched())
>>                         cpu_relax();
>>         }
>>         current_clr_polling();
>>
>>         return index;
>> }
>
> Ah, well, in that case there's a ton more broken than just this.
> kick_all_cpus_sync() won't work either, and cpuidle_reflect() pretty
> much expects to be called after each interrupt.

Agree.
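
To make the behaviour concrete, here is a minimal userspace sketch of
the polling loop. It is only a model under stated assumptions, not
kernel code: sim_poll(), its parameters and the exit_on_ipi switch are
hypothetical names invented for this illustration, and the
"exit on IPI" variant is not necessarily what the proposed patch does.
Time is counted in loop iterations; the IPI handler is modelled as
setting a flag and then returning into the loop.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of one CPU sitting in the polling idle state.
 *
 * ipi_at      iteration at which an IPI is delivered (-1 = never)
 * resched_at  iteration at which need_resched becomes true (-1 = never)
 * max_spins   upper bound so the simulation always terminates
 * exit_on_ipi false: leave the loop only on need_resched, as in the
 *             quoted poll_idle(); true: also leave once an IPI was seen
 *
 * Returns the number of iterations spent polling.
 */
static int sim_poll(int ipi_at, int resched_at, int max_spins,
                    bool exit_on_ipi)
{
        bool need_resched = false;
        bool ipi_seen = false;
        int spins;

        assert(max_spins > 0);

        for (spins = 0; spins < max_spins; spins++) {
                /* The interrupt is handled, then we return into the loop. */
                if (spins == ipi_at)
                        ipi_seen = true;
                if (spins == resched_at)
                        need_resched = true;

                /* Exit condition of the quoted poll_idle() loop. */
                if (need_resched)
                        break;
                /* Hypothetical extra exit condition. */
                if (exit_on_ipi && ipi_seen)
                        break;
        }

        return spins;
}
```

With an IPI at iteration 10 and a reschedule at iteration 3000,
sim_poll(10, 3000, 100000, false) keeps polling until the reschedule,
while sim_poll(10, 3000, 100000, true) leaves at the IPI, so the idle
state selection could run again right away.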
> Then again, not reflecting properly isn't really a problem, its not like
> not accounting interrupts is going to safe power much.

I think the main issue here is to exit the poll_idle loop when an IPI is
received. IIUC, there is a pm_qos user, perhaps a driver (Chuansheng can
give more details), setting a very short latency, so the cpuidle
framework chooses a shallow state like poll_idle. The driver then sets a
bigger latency, which sends an IPI to wake all the CPUs. As the CPUs are
in poll_idle, they don't exit until some event (a reschedule or
whatever) makes them leave the need_resched() loop. The CPUs can
therefore keep spinning in that loop for several seconds while we expect
them to exit poll_idle and enter a deeper idle state, which costs extra
energy.

-- 
Linaro.org │ Open source software for ARM SoCs

Follow Linaro: Facebook | Twitter | Blog