From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: deepthi@linux.vnet.ibm.com, linux-pm@vger.kernel.org,
daniel.lezcano@linaro.org, rjw@rjwysocki.net,
linux-kernel@vger.kernel.org, paulmck@linux.vnet.ibm.com,
linuxppc-dev@lists.ozlabs.org, tuukka.tikkanen@linaro.org
Subject: Re: [PATCH] cpuidle/menu: Fail cpuidle_idle_call() if no idle state is acceptable
Date: Tue, 14 Jan 2014 13:55:08 +0530
Message-ID: <52D4F464.5070707@linux.vnet.ibm.com>
In-Reply-To: <52D4E07E.204@linux.vnet.ibm.com>
Hi Srivatsa,
On 01/14/2014 12:30 PM, Srivatsa S. Bhat wrote:
> On 01/14/2014 11:35 AM, Preeti U Murthy wrote:
>> On PowerPC, in a particular test scenario, all the cpu idle states were disabled.
>> Inspite of this it was observed that the idle state count of the shallowest
>> idle state, snooze, was increasing.
>>
>> This is because the governor returns the idle state index as 0 even in
>> scenarios when no idle state can be chosen. These scenarios could be when the
>> latency requirement is 0 or as mentioned above when the user wants to disable
>> certain cpu idle states at runtime. In the latter case, its possible that no
>> cpu idle state is valid because the suitable states were disabled
>> and the rest did not match the menu governor criteria to be chosen as the
>> next idle state.
>>
>> This patch adds the code to indicate that a valid cpu idle state could not be
>> chosen by the menu governor and reports back to arch so that it can take some
>> default action.
>>
>
> That sounds fair enough. However, the "default" action of pseries idle loop
> (pseries_lpar_idle()) surprises me. It enters Cede, which is _deeper_ than doing
> a snooze! IOW, a user might "disable" cpuidle or set the PM_QOS_CPU_DMA_LATENCY
> to 0 hoping to prevent the CPUs from going to deep idle states, but then the
> machine would still end up going to Cede, even though that wont get reflected
> in the idle state counts. IMHO that scenario needs some thought as well...
Yes, I did see this. However, this patch is only meant to communicate
whether the cpuidle governor succeeded in choosing an idle state, so I
wanted to address the default action of the pseries idle loop
separately. You are right, we will need to understand the patch which
introduced this action. I will take a look at it.
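
To make the intent concrete, here is a minimal user-space sketch of the
idea, not the kernel code itself. The names demo_state, demo_select() and
demo_idle_call() are hypothetical, the states are assumed to be ordered
from shallowest to deepest, and the selection rule is deliberately much
simpler than what the menu governor really does. It only shows a select
routine reporting -1 when nothing is acceptable, and the caller turning
that into -EINVAL so that the arch code can take its default action.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cpuidle state. */
struct demo_state {
	unsigned int exit_latency_us;	/* worst-case wakeup latency */
	int disabled;			/* non-zero if disabled at runtime */
};

/*
 * Pick the deepest enabled state whose exit latency fits within
 * latency_req_us. States are assumed to be sorted shallowest first.
 * Returns -1 when no state is acceptable, which includes the
 * latency_req_us == 0 case.
 */
static int demo_select(const struct demo_state *states, size_t count,
		       unsigned int latency_req_us)
{
	int chosen = -1;
	size_t i;

	/* Very strict latency requirement: no state can be chosen. */
	if (latency_req_us == 0)
		return -1;

	for (i = 0; i < count; i++) {
		if (states[i].disabled)
			continue;
		if (states[i].exit_latency_us > latency_req_us)
			continue;
		chosen = (int)i;
	}
	return chosen;
}

/*
 * Returns 0 and stores the chosen index in *entered when a state was
 * selected. Returns -EINVAL and leaves *entered untouched when the
 * caller should fall back to its own default idle action.
 */
static int demo_idle_call(const struct demo_state *states, size_t count,
			  unsigned int latency_req_us, int *entered)
{
	int next_state = demo_select(states, count, latency_req_us);

	if (next_state < 0)
		return -EINVAL;

	*entered = next_state;
	return 0;
}
```

With every state disabled, or with a latency requirement of 0,
demo_idle_call() returns -EINVAL without touching *entered, so in this
model no idle state would be accounted as entered.
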
>
>> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>> ---
>>
>> drivers/cpuidle/cpuidle.c | 6 +++++-
>> drivers/cpuidle/governors/menu.c | 7 ++++---
>> 2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>> index a55e68f..5bf06bb 100644
>> --- a/drivers/cpuidle/cpuidle.c
>> +++ b/drivers/cpuidle/cpuidle.c
>> @@ -131,8 +131,9 @@ int cpuidle_idle_call(void)
>>
>> /* ask the governor for the next state */
>> next_state = cpuidle_curr_governor->select(drv, dev);
>> +
>> + dev->last_residency = 0;
>> if (need_resched()) {
>> - dev->last_residency = 0;
>> /* give the governor an opportunity to reflect on the outcome */
>> if (cpuidle_curr_governor->reflect)
>> cpuidle_curr_governor->reflect(dev, next_state);
>
> The comments on top of the .reflect() routines of the governors say that the
> second parameter is the index of the actual state entered. But after this patch,
> next_state can be negative, indicating an invalid index. So those comments need
> to be updated accordingly.
Right, I will take care of the comment in the next post.
>
>> @@ -140,6 +141,9 @@ int cpuidle_idle_call(void)
>> return 0;
>> }
>>
>> + if (next_state < 0)
>> + return -EINVAL;
>
> The exit path above (due to need_resched) returns with irqs enabled, but the new
> one you are adding (next_state < 0) returns with irqs disabled. This is correct,
> because in the latter case, "idle" is still in progress and the arch will choose
> a default handler to execute (unlike the former case where "idle" is over and
> hence its time to enable interrupts).
Correct.
>
> IMHO it would be good to add comments around this code to explain this subtle
> difference. We can never be too careful with these things... ;-)
Ok, will do so.
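
As a rough illustration of the two exit paths discussed above, here is a
user-space model, not the actual kernel code. The flag demo_irqs_enabled
and the helpers demo_local_irq_enable(), demo_local_irq_disable() and
irq_demo_idle_call() are hypothetical stand-ins, and the normal path is
assumed to return with interrupts enabled once the idle state has been
exited.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt-enable state. */
static bool demo_irqs_enabled;

static void demo_local_irq_enable(void)
{
	demo_irqs_enabled = true;
}

static void demo_local_irq_disable(void)
{
	demo_irqs_enabled = false;
}

/*
 * Model of the early exits in the idle call. next_state is what the
 * governor returned; resched_pending models need_resched().
 */
static int irq_demo_idle_call(int next_state, bool resched_pending)
{
	/* The idle path is modelled as starting with interrupts off. */
	demo_local_irq_disable();

	if (resched_pending) {
		/* Idle is over: interrupts are re-enabled before returning. */
		demo_local_irq_enable();
		return 0;
	}

	if (next_state < 0) {
		/*
		 * Idle is still in progress: return with interrupts
		 * disabled so the arch default handler can run.
		 */
		return -EINVAL;
	}

	/*
	 * Normal path: entering the state is elided here. Assumption:
	 * interrupts are enabled again by the time we return.
	 */
	demo_local_irq_enable();
	return 0;
}
```

The point of the model is only that a 0 return leaves interrupts enabled,
while the -EINVAL return leaves them disabled for the caller to handle.
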
>
>> +
>> trace_cpu_idle_rcuidle(next_state, dev->cpu);
>>
>> broadcast = !!(drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP);
>> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
>> index cf7f2f0..6921543 100644
>> --- a/drivers/cpuidle/governors/menu.c
>> +++ b/drivers/cpuidle/governors/menu.c
>> @@ -283,6 +283,7 @@ again:
>> * menu_select - selects the next idle state to enter
>> * @drv: cpuidle driver containing state data
>> * @dev: the CPU
>> + * Returns -1 when no idle state is suitable
>> */
>> static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>> {
>> @@ -292,17 +293,17 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>> int multiplier;
>> struct timespec t;
>>
>> - if (data->needs_update) {
>> + if (data->last_state_idx >= 0 && data->needs_update) {
> ^^^^^
> Doesn't hurt, but actually unnecessary, since ->needs_update is set to 1
> only when index >= 0.
Right, we do not need this check. I was assuming that needs_update would
be consistent with index >= 0 only in the need_resched() case. But
needs_update is cleared each time the governor is invoked, and is set
again thereafter only if index >= 0.
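
The reasoning can be sketched in user space as follows. This is not the
menu governor code: demo_menu_data, demo_reflect() and
demo_governor_select() are hypothetical names, the updates counter exists
only to make the behaviour observable, and the governor's actual choice is
passed in as a parameter instead of being computed.

```c
/* Hypothetical, reduced stand-in for the governor's per-CPU data. */
struct demo_menu_data {
	int last_state_idx;
	int needs_update;
	int updates;	/* how many times the update step has run */
};

/* Record the outcome; request an update only for a valid index. */
static void demo_reflect(struct demo_menu_data *data, int index)
{
	data->last_state_idx = index;
	if (index >= 0)
		data->needs_update = 1;
}

/*
 * Run the pending update, if any, then record the new choice.
 * No last_state_idx >= 0 check is needed before the update:
 * needs_update can only be 1 if demo_reflect() saw index >= 0.
 */
static int demo_governor_select(struct demo_menu_data *data, int choice)
{
	if (data->needs_update) {
		data->updates++;
		data->needs_update = 0;
	}

	data->last_state_idx = choice;
	return data->last_state_idx;
}
```

In this model a reflect with a negative index never arms needs_update, so
the next select skips the update step without any extra index check.
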
>
>> menu_update(drv, dev);
>> data->needs_update = 0;
>> }
>>
>> - data->last_state_idx = 0;
>> + data->last_state_idx = -1;
>> data->exit_us = 0;
>>
>> /* Special case when user has set very strict latency requirement */
>> if (unlikely(latency_req == 0))
>> - return 0;
>> + return data->last_state_idx;
>>
>> /* determine the expected residency time, round up */
>> t = ktime_to_timespec(tick_nohz_get_sleep_length());
>>
>
> What about the ladder governor? I know its not used that much in practice,
> but I think it would be good to update that as well, just to keep it
> consistent.
Yes, this needs to be updated as well. But the ladder governor has a few
other details to take care of beyond what this patch handles in the
menu governor. Hence I will post that change separately.
Thanks
Regards
Preeti U Murthy
>
> Regards,
> Srivatsa S. Bhat
>