From: Trinabh Gupta <trinabh@linux.vnet.ibm.com>
To: Venkatesh Pallipadi <venki@google.com>,
Arjan van de Ven <arjan@linux.intel.com>,
peterz@infradead.org
Cc: lenb@kernel.org, suresh.b.siddha@intel.com,
benh@kernel.crashing.org, linux-kernel@vger.kernel.org,
ak@linux.intel.com
Subject: Re: [RFC V1] cpuidle: add idle routine registration and cleanup pm_idle pointer
Date: Wed, 20 Oct 2010 20:42:21 +0530 [thread overview]
Message-ID: <4CBF06D5.7020508@linux.vnet.ibm.com> (raw)
In-Reply-To: <4CBDEB14.2030304@linux.vnet.ibm.com>
On 10/20/2010 12:31 AM, Trinabh Gupta wrote:
>
>
> On 10/20/2010 12:19 AM, Venkatesh Pallipadi wrote:
>> On Tue, Oct 19, 2010 at 11:38 AM, Arjan van de Ven
>> <arjan@linux.intel.com> wrote:
>>> On 10/19/2010 11:36 AM, Trinabh Gupta wrote:
>>>>
>>>> The core of the kernel's idle routine on x86 presently depends on an
>>>> exported pm_idle function pointer that is unmanaged and causing
>>>> hazard to various subsystems when they save and restore it.
>>>> The first problem is that this exported pointer can be modified/flipped
>>>> by any subsystem. There is no tracking or notification mechanism.
>>>> Secondly and more importantly, various subsystems save the value of
>>>> this pointer, flip it and later restore to the saved value. There is
>>>> no guarantee that the saved value is still valid. The problem has
>>>> been discussed in [2,3] and Peter Zijlstra suggested removing pm_idle
>>>> and implementing a list based registration [1].
>>>>
>>>> This patch is an initial RFC implementation for x86 architecture
>>>> only. This framework can be generalised for other archs and also
>>>> include the current cpuidle framework for managing multiple idle
>>>> routines.
>>>>
>>>> Tests done with the patch:
>>>> ------------------------
>>>> 1. Build (on 2.6.36-rc7) and booted on x86 with C1E as deepest idle
>>>> state and current_idle was selected to be mwait_idle.
>>>>
>>>> 2. Build (on 2.5.36-rc8) and booted on x86 (Nehalem) with ACPI C3 as
>>>> deepest sleep state. The current_idle was selected to be
>>>> cpuidle_idle_call which is the cpuidle subsystem that will further
>>>> select idle routines from {C1,C2,C3}.
>>>>
>>>> Future implementation will try to eliminate this hirearchy and have
>>>> a single registration and menu/idle cpuidle governor for selection
>>>> of idle routine.
>>>
>>>
>>> looks like you're duplicating the cpuidle subsystem
>>>
>>> how about biting the bullet and just always and only use the cpuidle
>>> subsystem for all idle on x86 ?
>>>
>>
>> I agree with Arjan.
>> If we have a default_cpuidle driver which parses idle= params, handles
>> various mwait quirks in x86 process*.c and registers with cpuidle, we
>> can then always call cpuidle idle routine on x86.
>
> This wouldn't duplicate code. It would move parts/functionality of
> cpuidle into the kernel, keeping governors alone as modules.
>
> If we directly call cpuidle_idle_call() then this may be too much
> overhead for architectures that have single idle routine i.e cases where
> cpuidle is not used and will be seen as a bloat. (c.f goals 4a,b).
Hi Venki, Arjan
Building cpuidle into the kernel would add ~7KB for everyone, even
x86 architectures having only single idle state/routine. Andi Kleen had
pointed this out before. Please see http://lkml.org/lkml/2009/10/12/421.
% size drivers/cpuidle/built-in.o
text data bss dec hex filename
6054 1288 44 7386 1cda drivers/cpuidle/built-in.o
One solution is to move some of the functionality into the
kernel and keep governors as module. Currently this prototype
adds ~ 0.5KB and also does more robust management of idle routines.
The bottom line is that pm_idle needs to be removed one way or
the other.
% size arch/x86/kernel/built-in.o
ext data bss dec hex filename
Without this patch:
341328 445821 507713 1294862 13c20e arch/x86/kernel/built-in.o
With this patch:
341732 445885 507713 1295330 13c3e2 arch/x86/kernel/built-in.o
Increase: 468 Bytes
Thanks,
-Trinabh
next prev parent reply other threads:[~2010-10-20 15:12 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-19 18:36 [RFC V1] cpuidle: add idle routine registration and cleanup pm_idle pointer Trinabh Gupta
2010-10-19 18:38 ` Arjan van de Ven
2010-10-19 18:49 ` Venkatesh Pallipadi
2010-10-19 19:01 ` Trinabh Gupta
2010-10-20 15:12 ` Trinabh Gupta [this message]
2010-10-20 15:18 ` Arjan van de Ven
2010-10-20 15:34 ` Andi Kleen
2010-10-20 16:03 ` Arjan van de Ven
2010-10-20 19:19 ` Vaidyanathan Srinivasan
2010-10-20 19:25 ` Arjan van de Ven
2010-10-20 19:28 ` Peter Zijlstra
2010-10-20 19:29 ` Arjan van de Ven
2010-10-20 19:40 ` Vaidyanathan Srinivasan
2010-10-20 19:44 ` Arjan van de Ven
2010-10-20 19:47 ` Venkatesh Pallipadi
2010-10-20 20:03 ` Vaidyanathan Srinivasan
2010-10-20 20:47 ` Arjan van de Ven
2010-10-20 21:19 ` Venkatesh Pallipadi
2010-10-20 20:55 ` Dipankar Sarma
2010-10-20 15:57 ` Trinabh Gupta
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4CBF06D5.7020508@linux.vnet.ibm.com \
--to=trinabh@linux.vnet.ibm.com \
--cc=ak@linux.intel.com \
--cc=arjan@linux.intel.com \
--cc=benh@kernel.crashing.org \
--cc=lenb@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=suresh.b.siddha@intel.com \
--cc=venki@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.