From: Trinabh Gupta <trinabh@linux.vnet.ibm.com>
To: Venkatesh Pallipadi <venki@google.com>,
Arjan van de Ven <arjan@linux.intel.com>,
peterz@infradead.org
Cc: lenb@kernel.org, suresh.b.siddha@intel.com,
benh@kernel.crashing.org, linux-kernel@vger.kernel.org,
ak@linux.intel.com
Subject: Re: [RFC V1] cpuidle: add idle routine registration and cleanup pm_idle pointer
Date: Wed, 20 Oct 2010 20:42:21 +0530 [thread overview]
Message-ID: <4CBF06D5.7020508@linux.vnet.ibm.com> (raw)
In-Reply-To: <4CBDEB14.2030304@linux.vnet.ibm.com>
On 10/20/2010 12:31 AM, Trinabh Gupta wrote:
>
>
> On 10/20/2010 12:19 AM, Venkatesh Pallipadi wrote:
>> On Tue, Oct 19, 2010 at 11:38 AM, Arjan van de Ven
>> <arjan@linux.intel.com> wrote:
>>> On 10/19/2010 11:36 AM, Trinabh Gupta wrote:
>>>>
>>>> The core of the kernel's idle routine on x86 presently depends on an
>>>> exported pm_idle function pointer that is unmanaged and causing
>>>> hazard to various subsystems when they save and restore it.
>>>> The first problem is that this exported pointer can be modified/flipped
>>>> by any subsystem. There is no tracking or notification mechanism.
>>>> Secondly and more importantly, various subsystems save the value of
>>>> this pointer, flip it and later restore to the saved value. There is
>>>> no guarantee that the saved value is still valid. The problem has
>>>> been discussed in [2,3] and Peter Zijlstra suggested removing pm_idle
>>>> and implementing a list based registration [1].
>>>>
>>>> This patch is an initial RFC implementation for x86 architecture
>>>> only. This framework can be generalised for other archs and also
>>>> include the current cpuidle framework for managing multiple idle
>>>> routines.
>>>>
>>>> Tests done with the patch:
>>>> ------------------------
>>>> 1. Build (on 2.6.36-rc7) and booted on x86 with C1E as deepest idle
>>>> state and current_idle was selected to be mwait_idle.
>>>>
>>>> 2. Build (on 2.5.36-rc8) and booted on x86 (Nehalem) with ACPI C3 as
>>>> deepest sleep state. The current_idle was selected to be
>>>> cpuidle_idle_call which is the cpuidle subsystem that will further
>>>> select idle routines from {C1,C2,C3}.
>>>>
>>>> Future implementation will try to eliminate this hirearchy and have
>>>> a single registration and menu/idle cpuidle governor for selection
>>>> of idle routine.
>>>
>>>
>>> looks like you're duplicating the cpuidle subsystem
>>>
>>> how about biting the bullet and just always and only use the cpuidle
>>> subsystem for all idle on x86 ?
>>>
>>
>> I agree with Arjan.
>> If we have a default_cpuidle driver which parses idle= params, handles
>> various mwait quirks in x86 process*.c and registers with cpuidle, we
>> can then always call cpuidle idle routine on x86.
>
> This wouldn't duplicate code. It would move parts/functionality of
> cpuidle into the kernel, keeping governors alone as modules.
>
> If we directly call cpuidle_idle_call() then this may be too much
> overhead for architectures that have single idle routine i.e cases where
> cpuidle is not used and will be seen as a bloat. (c.f goals 4a,b).
Hi Venki, Arjan
Building cpuidle into the kernel would add ~7KB for everyone, even
x86 architectures having only single idle state/routine. Andi Kleen had
pointed this out before. Please see http://lkml.org/lkml/2009/10/12/421.
% size drivers/cpuidle/built-in.o
text data bss dec hex filename
6054 1288 44 7386 1cda drivers/cpuidle/built-in.o
One solution is to move some of the functionality into the
kernel and keep governors as module. Currently this prototype
adds ~ 0.5KB and also does more robust management of idle routines.
The bottom line is that pm_idle needs to be removed one way or
the other.
% size arch/x86/kernel/built-in.o
ext data bss dec hex filename
Without this patch:
341328 445821 507713 1294862 13c20e arch/x86/kernel/built-in.o
With this patch:
341732 445885 507713 1295330 13c3e2 arch/x86/kernel/built-in.o
Increase: 468 Bytes
Thanks,
-Trinabh
next prev parent reply other threads:[~2010-10-20 15:12 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-19 18:36 [RFC V1] cpuidle: add idle routine registration and cleanup pm_idle pointer Trinabh Gupta
2010-10-19 18:38 ` Arjan van de Ven
2010-10-19 18:49 ` Venkatesh Pallipadi
2010-10-19 19:01 ` Trinabh Gupta
2010-10-20 15:12 ` Trinabh Gupta [this message]
2010-10-20 15:18 ` Arjan van de Ven
2010-10-20 15:34 ` Andi Kleen
2010-10-20 16:03 ` Arjan van de Ven
2010-10-20 19:19 ` Vaidyanathan Srinivasan
2010-10-20 19:25 ` Arjan van de Ven
2010-10-20 19:28 ` Peter Zijlstra
2010-10-20 19:29 ` Arjan van de Ven
2010-10-20 19:40 ` Vaidyanathan Srinivasan
2010-10-20 19:44 ` Arjan van de Ven
2010-10-20 19:47 ` Venkatesh Pallipadi
2010-10-20 20:03 ` Vaidyanathan Srinivasan
2010-10-20 20:47 ` Arjan van de Ven
2010-10-20 21:19 ` Venkatesh Pallipadi
2010-10-20 20:55 ` Dipankar Sarma
2010-10-20 15:57 ` Trinabh Gupta
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4CBF06D5.7020508@linux.vnet.ibm.com \
--to=trinabh@linux.vnet.ibm.com \
--cc=ak@linux.intel.com \
--cc=arjan@linux.intel.com \
--cc=benh@kernel.crashing.org \
--cc=lenb@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=suresh.b.siddha@intel.com \
--cc=venki@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).