From: Daniel Lezcano <daniel.lezcano@linaro.org>
To: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
Nicolas Pitre <nicolas.pitre@linaro.org>,
"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Lists linaro-kernel <linaro-kernel@lists.linaro.org>,
patches@linaro.org
Subject: Re: [PATCH V2 1/5] sched: idle: cpuidle: Check the latency req before idle
Date: Thu, 06 Nov 2014 13:27:30 +0100
Message-ID: <545B6932.4010308@linaro.org>
In-Reply-To: <545AF424.2070302@linux.vnet.ibm.com>
On 11/06/2014 05:08 AM, Preeti U Murthy wrote:
> On 11/05/2014 07:58 PM, Daniel Lezcano wrote:
>> On 10/29/2014 03:01 AM, Preeti U Murthy wrote:
>>> On 10/29/2014 12:29 AM, Daniel Lezcano wrote:
>>>> On 10/28/2014 04:51 AM, Preeti Murthy wrote:
>>>>> Hi Daniel,
>>>>>
>>>>> On Thu, Oct 23, 2014 at 2:31 PM, Daniel Lezcano
>>>>> <daniel.lezcano@linaro.org> wrote:
>>>>>> When the pmqos latency requirement is set to zero that means "poll in
>>>>>> all the
>>>>>> cases".
>>>>>>
>>>>>> That is correctly implemented on x86 but not on the other archs.
>>>>>>
>>>>>> As how is written the code, if the latency request is zero, the
>>>>>> governor will
>>>>>> return zero, so corresponding, for x86, to the poll function, but for
>>>>>> the
>>>>>> others arch the default idle function. For example, on ARM this is
>>>>>> wait-for-
>>>>>> interrupt with a latency of '1', so violating the constraint.
>>>>>
>>>>> This is not true actually. On PowerPC the idle state 0 has an
>>>>> exit_latency of 0.
>>>>>
>>>>>>
>>>>>> In order to fix that, do the latency requirement check *before*
>>>>>> calling the
>>>>>> cpuidle framework in order to jump to the poll function without
>>>>>> entering
>>>>>> cpuidle. That has several benefits:
>>>>>
>>>>> Doing so actually hurts on PowerPC. Because the idle loop defined for
>>>>> idle state 0 is different from what cpu_relax() does in
>>>>> cpu_idle_loop().
>>>>> The spinning is more power efficient in the former case. Moreover we
>>>>> also set
>>>>> certain register values which indicate an idle cpu. The ppc_runlatch
>>>>> bits
>>>>> do precisely this. These register values are being read by some user
>>>>> space
>>>>> tools. So we will end up breaking them with this patch
>>>>>
>>>>> My suggestion is very well keep the latency requirement check in
>>>>> kernel/sched/idle.c
>>>>> like your doing in this patch. But before jumping to cpu_idle_loop
>>>>> verify if the
>>>>> idle state 0 has an exit_latency > 0 in addition to your check on the
>>>>> latency_req == 0.
>>>>> If not, you can fall through to the regular path of calling into the
>>>>> cpuidle driver.
>>>>> The scheduler can query the cpuidle_driver structure anyway.
>>>>>
>>>>> What do you think?
>>>>
>>>> Thanks for reviewing the patch and spotting this.
>>>>
>>>> Wouldn't make sense to create:
>>>>
>>>> void __weak_cpu_idle_poll(void) ?
>>>>
>>>> and override it with your specific poll function ?
>>>>
>>>
>>> No this would become ugly as far as I can see. A weak function has to be
>>> defined under arch/* code. We will either need to duplicate the idle
>>> loop that we already have in the drivers or point the weak function to
>>> the first idle state defined by our driver. Both of which is not
>>> desirable (calling into the driver from arch code is ugly). Another
>>> reason why I don't like the idea of a weak function is that if you have
>>> missed looking at a specific driver and they have an idle loop with
>>> features similar to on powerpc, you will have to spot it yourself and
>>> include the arch specific cpu_idle_poll() for them.
>>
>> Yes, I agree this is a fair point. But actually I don't see the interest
>> of having the poll loop in the cpuidle driver. These cleanups are
>
> We can't do that simply because the idle poll loop has arch specific
> bits on powerpc.

I am not sure.

Could you describe the difference between the arch_cpu_idle function in
arch/powerpc/kernel/idle.c and the 0th PowerPC idle state? Is it a kind
of duplicate?

And for polling, do you really want to use "while (...) cpu_relax();",
as it is x86 specific, instead of powerpc's arch_idle?

Today, if latency_req == 0, the governor returns the 0th idle state, so
polling. If we jump to arch_cpu_idle_poll, the result will be the same
for all architectures.

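
To make the two options under discussion concrete, here is a minimal
user-space sketch, not kernel code. The struct, the enum and both
function names are hypothetical and exist only for illustration; only
latency_req and exit_latency come from the thread. The first function
models the check in this patch, the second models the variant that also
looks at the exit_latency of idle state 0.

```c
/* Hypothetical, simplified stand-in for a cpuidle state description. */
struct idle_state {
	unsigned int exit_latency;	/* in microseconds */
};

/* Which idle path the (modelled) scheduler idle code would take. */
enum idle_path {
	IDLE_PATH_GENERIC_POLL,	/* generic poll loop, cpuidle bypassed */
	IDLE_PATH_CPUIDLE,	/* call into the cpuidle governor/driver */
};

/* Check as in this patch: a zero latency request always polls. */
static enum idle_path pick_path_patch(int latency_req)
{
	return latency_req == 0 ? IDLE_PATH_GENERIC_POLL : IDLE_PATH_CPUIDLE;
}

/*
 * Suggested variant: bypass cpuidle only when state 0 would violate the
 * zero latency constraint, i.e. when its exit_latency is non-zero.
 */
static enum idle_path pick_path_suggested(int latency_req,
					  const struct idle_state *state0)
{
	if (latency_req == 0 && state0->exit_latency > 0)
		return IDLE_PATH_GENERIC_POLL;

	return IDLE_PATH_CPUIDLE;
}
```

With a state 0 exit_latency of 1 (the ARM wait-for-interrupt case
mentioned earlier) both functions pick the generic poll loop. With an
exit_latency of 0 (the PowerPC case) only the second one keeps calling
into the driver.
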
>> preparing the removal of the CPUIDLE_DRIVER_STATE_START macro which
>> leads to a lot of mess in the cpuidle code.
>
> How is the suggestion to check the exit_latency of idle state 0 when
> latency_req == 0 going to hinder this removal?

It sounds a bit hackish. I prefer to sort out the current situation.

And by the way, what is the reasoning behind having a target_residency /
exit_latency equal to zero for an idle state?

All this sounds really fuzzy to me.

>> With the removal of this macro, we should be able to move the select
>> loop from the menu governor and use it everywhere else. Furthermore,
>> this state which is flagged with TIME_VALID, isn't because the local
>> interrupt are enabled so we are measuring the interrupt time processing.
>> Beside that the idle loop for x86 is mostly not used.
>>
>> So the idea would be to extract those idle loop from the drivers and use
>> them directly when:
>> 1. the idle selection fails (use the poll loop under certain
>> circumstances we have to redefine)
>
> This behavior will not change as per my suggestion.
>
>> 2. when the latency req is zero
>
> Its only here that I suggested you also verify state 0's exit_latency.
> For the reason that the arch may have a more optimized idle poll loop,
> which we cannot override with the generic cpuidle poll loop.
>
> Regards
> Preeti U Murthy
>>
>> That will result in a cleaner code in cpuidle and in the governor.
>>
>> Do you agree with that ?
>>
>>> But by having a check on the exit_latency, you are claiming that since
>>> the driver's 0th idle state is no better than the generic idle loop in
>>> cases of 0 latency req, we are better off calling the latter, which
>>> looks reasonable. That way you don't have to bother about worsening the
>>> idle loop behavior on any other driver.
>>
>>
>>
>>
>>
>
--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog