linuxppc-dev.lists.ozlabs.org archive mirror
From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: deepthi@linux.vnet.ibm.com, linux-pm@vger.kernel.org,
	daniel.lezcano@linaro.org, rjw@rjwysocki.net,
	linux-kernel@vger.kernel.org, paulmck@linux.vnet.ibm.com,
	linuxppc-dev@lists.ozlabs.org, tuukka.tikkanen@linaro.org
Subject: Re: [PATCH] cpuidle/menu: Fail cpuidle_idle_call() if no idle state is acceptable
Date: Tue, 14 Jan 2014 13:55:08 +0530
Message-ID: <52D4F464.5070707@linux.vnet.ibm.com>
In-Reply-To: <52D4E07E.204@linux.vnet.ibm.com>

Hi Srivatsa,

On 01/14/2014 12:30 PM, Srivatsa S. Bhat wrote:
> On 01/14/2014 11:35 AM, Preeti U Murthy wrote:
>> On PowerPC, in a particular test scenario, all the cpu idle states were disabled.
>> In spite of this, it was observed that the idle state count of the shallowest
>> idle state, snooze, was increasing.
>>
>> This is because the governor returns the idle state index as 0 even in
>> scenarios when no idle state can be chosen. These scenarios could be when the
>> latency requirement is 0 or, as mentioned above, when the user wants to disable
>> certain cpu idle states at runtime. In the latter case, it's possible that no
>> cpu idle state is valid because the suitable states were disabled
>> and the rest did not match the menu governor criteria to be chosen as the
>> next idle state.
>>
>> This patch adds the code to indicate that a valid cpu idle state could not be
>> chosen by the menu governor and reports back to arch so that it can take some
>> default action.
>>
> 
> That sounds fair enough. However, the "default" action of pseries idle loop
> (pseries_lpar_idle()) surprises me. It enters Cede, which is _deeper_ than doing
> a snooze! IOW, a user might "disable" cpuidle or set the PM_QOS_CPU_DMA_LATENCY
> to 0 hoping to prevent the CPUs from going to deep idle states, but then the
> machine would still end up going to Cede, even though that won't get reflected
> in the idle state counts. IMHO that scenario needs some thought as well...

Yes, I did see this. But since this patch is only meant to communicate
whether the cpuidle governor succeeded in choosing an idle state, I
wished to address the default action of the pseries idle loop
separately. You are right, we will need to understand the patch which
introduced this action. I will take a look at it.
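
To make the intended contract concrete, here is a minimal user-space sketch,
not kernel code. The model_* names and the simplified selection rule (the
deepest enabled state whose exit latency fits, with states assumed to be
ordered shallowest first) are assumptions for illustration only; the real menu
governor also weighs the predicted residency. The point is only that the
selection reports -1 rather than defaulting to index 0:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_state {
	int disabled;                 /* nonzero if the user disabled this state */
	unsigned int exit_latency_us; /* time needed to leave the state */
};

struct model_driver {
	const struct model_state *states; /* assumed ordered shallowest first */
	int state_count;
};

/*
 * Return the index of the deepest enabled state whose exit latency fits
 * latency_req_us, or -1 when no state qualifies. Index 0 is never used
 * as a silent default.
 */
static int model_select(const struct model_driver *drv,
			unsigned int latency_req_us)
{
	int i, chosen = -1;

	assert(drv != NULL);

	/* A latency requirement of 0 means no idle state is acceptable. */
	if (latency_req_us == 0)
		return -1;

	for (i = 0; i < drv->state_count; i++) {
		if (drv->states[i].disabled)
			continue;
		if (drv->states[i].exit_latency_us > latency_req_us)
			continue;
		chosen = i;
	}

	return chosen;
}
```

A caller that gets -1 back knows that no state was entered and can leave the
idle state counts alone while it decides on its own default action.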

> 
>> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>> ---
>>
>>  drivers/cpuidle/cpuidle.c        |    6 +++++-
>>  drivers/cpuidle/governors/menu.c |    7 ++++---
>>  2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>> index a55e68f..5bf06bb 100644
>> --- a/drivers/cpuidle/cpuidle.c
>> +++ b/drivers/cpuidle/cpuidle.c
>> @@ -131,8 +131,9 @@ int cpuidle_idle_call(void)
>>
>>  	/* ask the governor for the next state */
>>  	next_state = cpuidle_curr_governor->select(drv, dev);
>> +
>> +	dev->last_residency = 0;
>>  	if (need_resched()) {
>> -		dev->last_residency = 0;
>>  		/* give the governor an opportunity to reflect on the outcome */
>>  		if (cpuidle_curr_governor->reflect)
>>  			cpuidle_curr_governor->reflect(dev, next_state);
> 
> The comments on top of the .reflect() routines of the governors say that the
> second parameter is the index of the actual state entered. But after this patch,
> next_state can be negative, indicating an invalid index. So those comments need
> to be updated accordingly.

Right, I will take care of the comment in the next post.
> 
>> @@ -140,6 +141,9 @@ int cpuidle_idle_call(void)
>>  		return 0;
>>  	}
>>
>> +	if (next_state < 0)
>> +		return -EINVAL;
> 
> The exit path above (due to need_resched) returns with irqs enabled, but the new
> one you are adding (next_state < 0) returns with irqs disabled. This is correct,
> because in the latter case, "idle" is still in progress and the arch will choose
> a default handler to execute (unlike the former case where "idle" is over and
> hence it's time to enable interrupts).

Correct.
> 
> IMHO it would be good to add comments around this code to explain this subtle
> difference. We can never be too careful with these things... ;-)

Ok, will do so.
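
As a rough illustration of the difference between the two early exits, here is
a small user-space model. It is a sketch under stated assumptions: the
interrupt flag is a plain boolean, all model_* names are hypothetical, and the
return value 1 is only a marker of this model meaning "would go on to enter
the chosen state".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the CPU's interrupt-enable flag. */
static bool model_irqs_enabled;

static void model_local_irq_enable(void)
{
	model_irqs_enabled = true;
}

static void model_local_irq_disable(void)
{
	model_irqs_enabled = false;
}

/*
 * Models only the two early exits of cpuidle_idle_call() after the patch.
 * Returns 0 or -EINVAL for those exits, and 1 (a marker used only by this
 * model) when the call would go on to enter the chosen state.
 */
static int model_idle_call(int next_state, bool resched_pending)
{
	/* The idle loop calls in with interrupts disabled. */
	assert(!model_irqs_enabled);

	if (resched_pending) {
		/* Idle is over: hand control back with irqs enabled. */
		model_local_irq_enable();
		return 0;
	}

	if (next_state < 0) {
		/*
		 * Idle is still in progress: keep irqs disabled so the
		 * arch can run its default idle handler.
		 */
		return -EINVAL;
	}

	return 1;
}
```

The only behavioural difference between the first two branches is the state of
the interrupt flag on return, which is what the comments in the real code
should spell out.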
> 
>> +
>>  	trace_cpu_idle_rcuidle(next_state, dev->cpu);
>>
>>  	broadcast = !!(drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP);
>> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
>> index cf7f2f0..6921543 100644
>> --- a/drivers/cpuidle/governors/menu.c
>> +++ b/drivers/cpuidle/governors/menu.c
>> @@ -283,6 +283,7 @@ again:
>>   * menu_select - selects the next idle state to enter
>>   * @drv: cpuidle driver containing state data
>>   * @dev: the CPU
>> + * Returns -1 when no idle state is suitable
>>   */
>>  static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>>  {
>> @@ -292,17 +293,17 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>>  	int multiplier;
>>  	struct timespec t;
>>
>> -	if (data->needs_update) {
>> +	if (data->last_state_idx >= 0 && data->needs_update) {
>                ^^^^^
> Doesn't hurt, but actually unnecessary, since ->needs_update is set to 1
> only when index >= 0.

Right, we do not need this check. I was assuming that needs_update would
be consistent with index >= 0 only in the need_resched() case. But
needs_update is cleared each time the governor is invoked, and is set
again thereafter only if index >= 0.
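
The invariant can be sketched in user space as follows. This is a simplified
model, not the governor itself: the model_* names and the updates counter are
hypothetical, and the counter merely stands in for the call to menu_update().

```c
#include <assert.h>

/* Hypothetical, reduced version of the governor's per-CPU data. */
struct model_menu_data {
	int last_state_idx;
	int needs_update;
	int updates; /* counts how often the update step ran */
};

/* Select path: consume a pending update, then record the new choice. */
static int model_menu_select(struct model_menu_data *data, int chosen)
{
	if (data->needs_update) {
		/* Holds because reflect sets the flag only for index >= 0. */
		assert(data->last_state_idx >= 0);
		data->updates++;
		data->needs_update = 0;
	}

	data->last_state_idx = chosen; /* may be -1: nothing suitable */
	return chosen;
}

/* Reflect path: request an update only if a real state was entered. */
static void model_menu_reflect(struct model_menu_data *data, int index)
{
	data->last_state_idx = index;
	if (index >= 0)
		data->needs_update = 1;
}
```

Since the flag is set only alongside a non-negative index, the extra
last_state_idx >= 0 test in the select path is redundant.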

> 
>>  		menu_update(drv, dev);
>>  		data->needs_update = 0;
>>  	}
>>
>> -	data->last_state_idx = 0;
>> +	data->last_state_idx = -1;
>>  	data->exit_us = 0;
>>
>>  	/* Special case when user has set very strict latency requirement */
>>  	if (unlikely(latency_req == 0))
>> -		return 0;
>> +		return data->last_state_idx;
>>
>>  	/* determine the expected residency time, round up */
>>  	t = ktime_to_timespec(tick_nohz_get_sleep_length());
>>
> 
> What about the ladder governor? I know it's not used that much in practice,
> but I think it would be good to update that as well, just to keep it
> consistent.

Yes, this needs to be updated as well. But the ladder governor has a few
other details to take care of, in addition to what this patch handles in
the menu governor. Hence I will be posting that separately.

Thanks

Regards
Preeti U Murthy
> 
> Regards,
> Srivatsa S. Bhat
> 


Thread overview: 7+ messages
2014-01-14  6:05 [PATCH] cpuidle/menu: Fail cpuidle_idle_call() if no idle state is acceptable Preeti U Murthy
2014-01-14  6:16 ` Deepthi Dharwar
2014-01-14  7:00 ` Srivatsa S. Bhat
2014-01-14  7:37   ` Srivatsa S. Bhat
2014-01-14 11:02     ` Preeti U Murthy
2014-01-14  8:00   ` Deepthi Dharwar
2014-01-14  8:25   ` Preeti U Murthy [this message]
