From mboxrd@z Thu Jan 1 00:00:00 1970
From: daniel.lezcano@linaro.org (Daniel Lezcano)
Date: Tue, 03 Mar 2015 16:20:26 +0100
Subject: [PATCH] cpuidle: mvebu: Fix the CPU PM notifier usage
In-Reply-To: <54F5CC0D.6060509@libero.it>
References: <1424971248-29076-1-git-send-email-gregory.clement@free-electrons.com>
 <4892716.Bf612zPN6E@vostro.rjw.lan>
 <54F03B51.1010708@free-electrons.com>
 <54F58D2F.2010907@linaro.org>
 <54F58E3B.9040302@free-electrons.com>
 <54F59285.5060306@libero.it>
 <54F5AE5C.6060006@free-electrons.com>
 <54F5CC0D.6060509@libero.it>
Message-ID: <54F5D13A.1020002@linaro.org>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On 03/03/2015 03:58 PM, Fulvio wrote:
>
>> I didn't know you experienced random kernel panics and that you thought
>> it was related to the CPU Idle driver.
>>
>>> All I can say is that the system uses the "armadaxp_idle" driver and
>>> works fine when running "stress --cpu 8" in the background.
>>> I asked Netgear to provide a firmware without the idle driver to
>>> confirm if it's the cause of the problem, but they did not answer.
>>
>> I think that you can disable all the states by doing an
>>
>> echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/stateNUM/disable
>>
>> where NUM is the number of the state. Then it should disable
>> cpuidle on the fly.
> Thanks, I'll try that as soon as possible (I gave back the unit to my
> client) and report back.
>
> However, the description of the cpu_pm_enter function states:
> "Must be called on the affected CPU with interrupts disabled. Platform
> is responsible for ensuring that cpu_pm_enter is not called twice on the
> same CPU before cpu_pm_exit is called. Notified drivers can include VFP
> co-processor, interrupt controller and its PM extensions, local CPU
> timers context save/restore which shouldn't be interrupted. Hence it
> must be called with interrupts disabled."
>
> and the point is: is that an invariant?
> Do current code and future code safely assume that cpu_pm_enter is
> not called twice?

The fix is correct. The cpu_pm_enter/exit symmetry must be kept because
we don't know what the notifier clients are doing.

The point is: can we send it to stable@ as a bug fix or not?

> For example if cpu_pm_enter does "context save" and cpu_pm_exit does
> "context restore", calling cpu_pm_enter twice will overwrite the
> previously saved context: is that safe in all circumstances?

That is the drawback of notifiers: the kernel provides a service and
everyone plugs something into it. The cpu_pm notifiers are very
low-level functions, so the answer to your question is not obvious.

I already checked all the cpuidle drivers for the potential bug you
reported, and apparently everything else is fine: cpu_pm_exit is always
called after cpu_pm_enter.

As you stated, the API description implies cpu_pm_exit must be called
after cpu_pm_enter. So the fix is right.

> I assume the rule "It must fix a real bug that bothers people (not a,
> "This could be a problem..." type thing)." is to avoid committing
> useless changes that may introduce new bugs, but I do not think that
> applies to this case: a bug report from an unknown user (me) should
> change nothing.

It would be perfect if we could reproduce the bug you are facing and
check that the patch fixes it. In that case, it goes to stable@
automatically.

-- 
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
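
P.S. The per-state "echo 1 > .../disable" suggestion quoted earlier can be
applied to every state on every CPU with a small loop. This is only a
minimal sketch, assuming the usual cpuidle sysfs layout; the function name
and the CPU_SYSFS variable are illustrative and do not come from the
thread.

```shell
# Root of the CPU sysfs tree; overridable so the loop can be tried on a
# scratch directory first (CPU_SYSFS is a name made up for this sketch).
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

# Write 1 to the "disable" file of every cpuidle state on every CPU.
disable_all_cpuidle_states() {
    for f in "$CPU_SYSFS"/cpu[0-9]*/cpuidle/state[0-9]*/disable; do
        # Skip the unexpanded glob (no cpuidle directory) and any file
        # that the current user cannot write.
        [ -w "$f" ] || continue
        echo 1 > "$f"
    done
}
```

Calling disable_all_cpuidle_states as root should then mark every listed
state as disabled; writing 0 to the same files re-enables them.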