* [RFC] A New Power Management API
@ 2005-04-15 2:46 Adam Belay
2005-04-15 8:21 ` Benjamin Herrenschmidt
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Adam Belay @ 2005-04-15 2:46 UTC (permalink / raw)
To: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 7483 bytes --]
Hi all,
I'm in the process of designing a new power management API. I tried to
incorporate some of the ideas and suggestions made during past
discussions. Also, I've been studying the requirements of the PCI and
ACPI power specifications.
My current efforts are as follows:
The new API uses a power container/domain model. As an example, a PCI
or USB bridge can be considered a power domain that contains all of its
child devices. The API also provides a concept of power resources.
These are specific clocks and power planes that are enabled and disabled
based on the requirements of a device at a given state. A power resource
can belong to a power domain, or be in the system pool. Power resources
are particularly useful for supporting ACPI, but I suspect they will be
needed in other places as well.
Power transitions are controlled by "power drivers". Buses with power
management features may provide power drivers to control their devices.
Device driver authors also have the option of implementing their own
special "power driver". In short, power transition code has been
separated from "struct bus_type". I think this will be more robust.
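As a rough illustration of what separating the transition code into a "power driver" could look like, here is a minimal userspace sketch. The "mini_" and "demo_" names are hypothetical stand-ins for the structures in the header below, and the convention that available() returns nonzero when a state can be entered is an assumption, not something the proposal specifies.

```c
/* Hypothetical, trimmed-down stand-ins for the proposed structures. */
struct mini_power_device;

struct mini_power_driver {
	const char *name;
	int (*set)(struct mini_power_device *power, unsigned int state);
	int (*available)(struct mini_power_device *power,
			 unsigned int state, unsigned int system_state);
};

struct mini_power_device {
	unsigned int state;
	unsigned int max_state;
	const struct mini_power_driver *controller;
};

/*
 * Core-side helper: ask the bound power driver whether the state is
 * available (assumed: nonzero means yes), then let it do the transition.
 * Returns 0 on success and -1 on failure.
 */
static int mini_pm_set_state(struct mini_power_device *power,
			     unsigned int state, unsigned int system_state)
{
	const struct mini_power_driver *drv = power->controller;
	int err;

	if (!drv || !drv->set)
		return -1;
	if (drv->available && !drv->available(power, state, system_state))
		return -1;
	err = drv->set(power, state);
	if (!err)
		power->state = state;
	return err;
}

/* An imaginary bus power driver: any state up to max_state is allowed. */
static int demo_available(struct mini_power_device *power,
			  unsigned int state, unsigned int system_state)
{
	(void)system_state;
	return state <= power->max_state;
}

static int demo_set(struct mini_power_device *power, unsigned int state)
{
	(void)power;
	(void)state;
	return 0;	/* a real driver would program the hardware here */
}

static const struct mini_power_driver demo_driver = {
	"demo", demo_set, demo_available
};
```

The point of the sketch is only that the device carries a pointer to its power driver, so a bus or a device driver can supply the transition code without touching "struct bus_type".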
I made some minimal attempts at defining power policy managers. I will
likely expand this area as it is further discussed. Policy managers
will probably live inside the device driver code.
One interesting aspect of this implementation is that it allows each
device to have its own unique set of power states. It also provides a
mechanism to describe power requirements from the parent domain and/or
power resources on a per-state basis. I tried to make the design impose
as few limits and policies as possible.
In sysfs, a tree of power devices could be created to reflect the power
dependencies and topology. A power device may or may not belong to a
"struct device". If it does, a link will be created to the device node.
More to come, but for now I've included the header file. I look forward
to any comments or suggestions.
Thanks,
Adam
/*
* pm.h - the Power Management Interface
*
*/
#ifndef _LINUX_PM_H
#define _LINUX_PM_H
#ifdef __KERNEL__
#include <linux/config.h>
#include <linux/list.h>
#include <linux/kobject.h>
#include <asm/atomic.h>
struct device;
struct power_resource;
struct power_driver;
struct power_policy;
struct power_device;
/*
* Global System Power States
*
* Reflect the status of the overall system.
*/
struct system_power_state {
unsigned int state;
unsigned int flags;
struct list_head state_list;
};
extern int pm_register_system_state(struct system_power_state *state);
extern void pm_unregister_system_state(struct system_power_state *state);
extern struct system_power_state *
pm_get_system_state_data(unsigned int state);
/* System State Flags */
/* a generalization of the current state */
#define PM_SYSTEM_STATE_USABLE 0x00000001
#define PM_SYSTEM_STATE_SLEEP 0x00000002
#define PM_SYSTEM_STATE_OFF 0x00000004
/* where suspend data is being retained, if applicable */
#define PM_SYSTEM_STATE_SUSPEND_RAM 0x00000010
#define PM_SYSTEM_STATE_SUSPEND_DISK 0x00000020
/* the emphasis at a given state */
#define PM_SYSTEM_STATE_MAX_PERFORMANCE 0x00000100
#define PM_SYSTEM_STATE_MAX_BATTERY 0x00000200
#define PM_SYSTEM_STATE_BALANCED 0x00000400
/*
* Power Resources
*
* Power resources represent the physical requirements of "power devices".
* (e.g. clocks, power planes, etc.) Only define a power resource if your
* PM protocol allows specific control of the resource. If not, it can be
* a characteristic only represented by the device power state.
*/
struct power_resource_ops {
int (*update) (struct power_resource *power);
int (*on) (struct power_resource *power);
int (*off) (struct power_resource *power);
int (*available) (struct power_resource *power,
unsigned int system_state);
};
struct power_resource {
int enabled;
struct list_head deps;
struct power_resource_ops *ops;
struct power_device *domain;
unsigned int max_domain_state;
};
extern int pm_register_resource(struct power_resource *res);
extern void pm_unregister_resource(struct power_resource *res);
extern int pm_enable_resource(struct power_resource *res);
extern int pm_disable_resource(struct power_resource *res);
extern int pm_available_resource(struct power_resource *res,
unsigned int system_state);
/*
* Power States
*
* These are used to define device-specific power states.
*/
struct power_state {
char * name;
unsigned int state;
unsigned int flags;
unsigned int power_consumption; /* in mW */
unsigned int max_domain_state;
struct list_head state_list;
struct list_head deps;
};
extern void pm_add_state(struct power_device *dev, struct power_state *state);
extern void pm_remove_state(struct power_state *state);
#define PM_DEVICE_STATE_MAX_PERFORMANCE 0x00000001
#define PM_DEVICE_STATE_USABLE 0x00000002
#define PM_DEVICE_STATE_OFF 0x00000004
#define PM_DEVICE_STATE_MASK 0xffff0000 /* bus-specific values */
struct power_dependency {
struct power_resource * res;
struct list_head state_list;
struct list_head device_list;
};
extern struct power_dependency *
pm_alloc_dependency(struct power_resource * res);
extern void pm_add_dependency(struct power_state * state,
struct power_dependency * dep);
extern void pm_remove_dependency(struct power_dependency * dep);
/*
* Power Devices
*
* Power devices are the core building block of a system's power management
* topology. They may require power resources, but the primary dependency
* relationships are represented by a tree of "power devices". This tree
* is based on a power domain (or container) model.
*/
struct power_device {
struct kobject kobj;
unsigned int state;
unsigned int max_state;
struct list_head states;
struct list_head child_list;
struct list_head children;
struct power_device * domain;
struct device * dev;
struct power_driver * controller;
struct power_policy * policy;
void * policy_data;
int wake_enabled;
unsigned int max_wake_state;
struct list_head wake_deps;
};
extern int pm_register_device(struct power_device *power);
extern void pm_unregister_device(struct power_device *power);
extern unsigned int pm_get_state(struct power_device *power);
extern int pm_set_state(struct power_device *power, unsigned int state);
extern int pm_available_state(struct power_device *power, unsigned int state,
unsigned int system_state);
extern struct power_state *
pm_get_state_data(struct power_device *power, unsigned int state);
extern int pm_add_wake_dependency(struct power_device *power,
struct power_dependency *dep);
/*
* Power Drivers
*
* Power drivers provide information about a power device's current state and
* mechanisms for changing that state.
*/
struct power_driver {
char * name;
int (*attach) (struct power_device * power);
void (*detach) (struct power_device * power);
int (*update) (struct power_device * power);
int (*set) (struct power_device * power, unsigned int state);
int (*available) (struct power_device *power,
unsigned int state, unsigned int system_state);
};
extern int pm_bind_power_driver(struct power_device *power,
struct power_driver *drv);
extern void pm_unbind_power_driver(struct power_device *power);
/*
* Power Management Policy
*
* Makes power management related decisions on a per "power device" basis.
*/
struct power_policy {
int (*apply) (struct power_device *power, unsigned int system_state);
};
#endif /* __KERNEL__ */
#endif /* _LINUX_PM_H */
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC] A New Power Management API
2005-04-15 2:46 [RFC] A New Power Management API Adam Belay
@ 2005-04-15 8:21 ` Benjamin Herrenschmidt
2005-04-15 13:16 ` Daniel Petrini
` (2 subsequent siblings)
3 siblings, 0 replies; 18+ messages in thread
From: Benjamin Herrenschmidt @ 2005-04-15 8:21 UTC (permalink / raw)
To: Adam Belay; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 458 bytes --]
On Thu, 2005-04-14 at 22:46 -0400, Adam Belay wrote:
> Hi all,
>
> I'm in the process of designing a new power management API. I tried to
> incorporate some of the ideas and suggestions made during past
> discussions. Also, I've been studying the requirements of the PCI and
> ACPI power specifications.
>
> .../...
A quick look -> looks good. I need to spend more time thinking on it
though to get a better idea of the implications in practice.
Ben.
* Re: [RFC] A New Power Management API
2005-04-15 2:46 [RFC] A New Power Management API Adam Belay
2005-04-15 8:21 ` Benjamin Herrenschmidt
@ 2005-04-15 13:16 ` Daniel Petrini
2005-04-15 18:20 ` Adam Belay
2005-04-16 17:13 ` David Brownell
2005-04-15 15:50 ` Jordan Crouse
2005-04-16 17:27 ` David Brownell
3 siblings, 2 replies; 18+ messages in thread
From: Daniel Petrini @ 2005-04-15 13:16 UTC (permalink / raw)
To: Adam Belay; +Cc: Linux-pm mailing list
[-- Attachment #1.1: Type: text/plain, Size: 549 bytes --]
Hi,
Suggestion: as discussed in earlier weeks, why not adopt names that take into
consideration that Linux can run on systems other than desktops and servers?
#define PM_SYSTEM_STATE_SUSPEND_RAM 0x00000010
#define PM_SYSTEM_STATE_SUSPEND_DISK 0x00000020
PM_SYSTEM_STATE_SUSPEND_DISK can be something like
PM_SYSTEM_STATE_DEEP_SUSPEND and so on...
So you can have better acceptance of people working in other areas.
My two cents.
Regards,
Daniel Petrini
--
10LE - Linux
Nokia Institute of Tecnology - INdT
Manaus - Brazil
* Re: [RFC] A New Power Management API
2005-04-15 2:46 [RFC] A New Power Management API Adam Belay
2005-04-15 8:21 ` Benjamin Herrenschmidt
2005-04-15 13:16 ` Daniel Petrini
@ 2005-04-15 15:50 ` Jordan Crouse
2005-04-15 18:54 ` Adam Belay
2005-04-16 17:27 ` David Brownell
3 siblings, 1 reply; 18+ messages in thread
From: Jordan Crouse @ 2005-04-15 15:50 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1723 bytes --]
Adam -
Nice stuff. A few comments. First, a question:
Is it safe to assume that PMU devices (like these:
http://www.dialog-semiconductor.com/template.php?page_id=62) would fall
under the category of Power Resources? If so, we may need to consider
some more advanced hooks than just on/off.
Secondly, I have some comments regarding the system states, especially
the last three. While those three states are important, right off the
bat I can think of several scenarios where additional power states could
be imagined. I fear that if we hard code these states, then every
individual with a slightly different usage model on a given platform
would have to patch the kernel accordingly. This isn't such a far
fetched scenario on an embedded platform.
This is where I see the policy managers taking a greater role. In
addition to the hard coded states (the first five you defined sound fine
to me), the user would define a number of pseudo system states as well
as define the translation tables for device states under those pseudo
states. Each pseudo state would be registered with the PM core and
assigned a unique identifier.
We would then define a system state flag, say
PM_SYSTEM_STATE_PSEUDO, which would prompt a driver to query its policy
manager to get the translation for the right device state. Those
devices that don't understand or care about user defined power
states would simply ignore the flag.
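To make the proposal concrete, here is a minimal userspace sketch of a per-device translation table for pseudo system states. Everything in it is hypothetical: the value chosen for PM_SYSTEM_STATE_PSEUDO, the table size, and the pseudo_policy_* helpers are assumptions for illustration and are not part of Adam's header.

```c
#include <stddef.h>

#define PM_SYSTEM_STATE_PSEUDO	0x00001000	/* hypothetical flag value */
#define PM_MAX_PSEUDO_STATES	8		/* arbitrary table size */

/* One user-defined pseudo system state and the device state it maps to. */
struct pseudo_state_map {
	unsigned int pseudo_id;		/* identifier assigned at registration */
	unsigned int device_state;	/* device state to use under it */
};

/* Hypothetical per-device policy data: a small translation table. */
struct pseudo_policy {
	struct pseudo_state_map map[PM_MAX_PSEUDO_STATES];
	size_t count;
};

/* Record a translation. Returns 0 on success, -1 if the table is full. */
static int pseudo_policy_add(struct pseudo_policy *p,
			     unsigned int pseudo_id,
			     unsigned int device_state)
{
	if (p->count >= PM_MAX_PSEUDO_STATES)
		return -1;
	p->map[p->count].pseudo_id = pseudo_id;
	p->map[p->count].device_state = device_state;
	p->count++;
	return 0;
}

/*
 * Look up the device state for a pseudo system state. If the PSEUDO flag
 * is not set, or the device has no entry for this identifier, the caller's
 * fallback is returned, so devices that do not care can ignore the flag.
 */
static unsigned int pseudo_policy_translate(const struct pseudo_policy *p,
					    unsigned int system_flags,
					    unsigned int pseudo_id,
					    unsigned int fallback)
{
	size_t i;

	if (!(system_flags & PM_SYSTEM_STATE_PSEUDO))
		return fallback;
	for (i = 0; i < p->count; i++)
		if (p->map[i].pseudo_id == pseudo_id)
			return p->map[i].device_state;
	return fallback;
}
```

A linear search is enough here because the number of user-defined states per device is expected to be small.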
I think that way we could keep things simple for the laptop /
desktop folks while still providing a bit of flexibility on the embedded
side of things.
Regards,
Jordan
--
Jordan Crouse
Senior Linux Engineer
AMD - Personal Connectivity Solutions Group
<www.amd.com/embeddedprocessors>
* Re: [RFC] A New Power Management API
2005-04-15 13:16 ` Daniel Petrini
@ 2005-04-15 18:20 ` Adam Belay
2005-04-16 17:13 ` David Brownell
1 sibling, 0 replies; 18+ messages in thread
From: Adam Belay @ 2005-04-15 18:20 UTC (permalink / raw)
To: Daniel Petrini; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 917 bytes --]
On Fri, 2005-04-15 at 09:16 -0400, Daniel Petrini wrote:
> Hi,
>
> Suggestion: as discussed in earlier weeks, why not adopt names that take
> into consideration that Linux can run on systems other than desktops
> and servers?
>
> #define PM_SYSTEM_STATE_SUSPEND_RAM 0x00000010
> #define PM_SYSTEM_STATE_SUSPEND_DISK 0x00000020
>
> PM_SYSTEM_STATE_SUSPEND_DISK can be something like
> PM_SYSTEM_STATE_DEEP_SUSPEND and so on...
>
> So you can have better acceptance of people working in other areas.
>
> My two cents.
> Regards,
>
> Daniel Petrini
My idea here was to provide some optional flags that describe
characteristics of a given system-specific state. They may need to be
revised, but the intention was to provide meaningful information to
drivers without architecture specific code. Architecture-specific
drivers will likely ignore these and pay more attention to the state
number.
Thanks,
Adam
* Re: [RFC] A New Power Management API
2005-04-15 15:50 ` Jordan Crouse
@ 2005-04-15 18:54 ` Adam Belay
2005-04-16 2:53 ` Todd Poynor
2005-04-16 18:24 ` David Brownell
0 siblings, 2 replies; 18+ messages in thread
From: Adam Belay @ 2005-04-15 18:54 UTC (permalink / raw)
To: Jordan Crouse; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 3983 bytes --]
On Fri, 2005-04-15 at 09:50 -0600, Jordan Crouse wrote:
> Adam -
>
> Nice stuff. A few comments. First, a question:
>
> Is it safe to assume that PMU devices (like these:
> http://www.dialog-semiconductor.com/template.php?page_id=62) would fall
> under the category of Power Resources? If so, we may need to consider
> some more advanced hooks than just on/off.
I'm not familiar with how we would control such a device from software.
If you think more than _ON and _OFF would be necessary, I could
certainly revise this code. ACPI only supports _ON and _OFF in its
model. More complex stuff is handled by the state of the device.
---(power_resource)
| |
domain -> device
In my design, power resources are specific power planes and clocks that
belong to a given power domain (or that domain's parent). They often
depend on the state of the domain or the system state. A device may
require multiple power resources, but can only belong to one power
domain. I could see something like having a clock put at a slower
speed, so maybe we do need more than on and off.
> Secondly, I have some comments regarding the system states, especially
> the last three. While those three states are important, right off the
> bat I can think of several scenarios where additional power states could
> be imagined. I fear that if we hard code these states, then every
> individual with a slightly different usage model on a given platform
> would have to patch the kernel accordingly. This isn't such a far
> fetched scenario on an embedded platform.
I agree.
I'm not sure if I was very clear on this, but I intended for those
constants to be used as flags for describing characteristics of
platform-specific system states. On X86, ACPI would provide the system
states. So a suspend to ram state like ACPI S3 might be:
PM_SYSTEM_STATE_SLEEP | PM_SYSTEM_STATE_SUSPEND_RAM |
PM_SYSTEM_STATE_BALANCED
And S4 might be:
PM_SYSTEM_STATE_OFF | PM_SYSTEM_STATE_SUSPEND_DISK
I think these may need to be revised. I look forward to any
suggestions.
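Since each flag is a distinct bit, a state description is built with bitwise OR and queried with bitwise AND (ANDing two different single-bit flags together would always give 0). A minimal standalone sketch follows. The flag values are copied from the header in the original posting; acpi_s3_flags, acpi_s4_flags and pm_state_has() are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Flag values copied from the proposed pm.h. */
#define PM_SYSTEM_STATE_USABLE		0x00000001
#define PM_SYSTEM_STATE_SLEEP		0x00000002
#define PM_SYSTEM_STATE_OFF		0x00000004
#define PM_SYSTEM_STATE_SUSPEND_RAM	0x00000010
#define PM_SYSTEM_STATE_SUSPEND_DISK	0x00000020
#define PM_SYSTEM_STATE_BALANCED	0x00000400

/* Hypothetical descriptions of ACPI S3 and S4, combined with OR. */
static const unsigned int acpi_s3_flags =
	PM_SYSTEM_STATE_SLEEP | PM_SYSTEM_STATE_SUSPEND_RAM |
	PM_SYSTEM_STATE_BALANCED;

static const unsigned int acpi_s4_flags =
	PM_SYSTEM_STATE_OFF | PM_SYSTEM_STATE_SUSPEND_DISK;

/* Hypothetical helper: true if every bit in 'wanted' is set in 'flags'. */
static bool pm_state_has(unsigned int flags, unsigned int wanted)
{
	return (flags & wanted) == wanted;
}
```

A driver without architecture-specific knowledge could then ask, for example, whether pm_state_has(flags, PM_SYSTEM_STATE_SUSPEND_RAM) holds, without knowing the platform's state number.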
Power drivers will often have knowledge of the platform-specific states,
and they will inform the policy manager of what states/options are
available. The policy manager will then attempt to determine the
intentions of the system state, and after also factoring in user
settings, choose which state to use. Policy managers have the option of
considering the actual system state number if they're aware of it.
Otherwise, they will use the flags to guess what would be appropriate
for the system state.
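A minimal sketch of the flag-based guess described above, for a policy manager that does not know the platform's state number. The flag values are copied from the header; the demo_* device states, the function name and the "user prefers battery" setting are all hypothetical, and the precedence order is just one plausible choice.

```c
/* Flag values copied from the proposed pm.h. */
#define PM_SYSTEM_STATE_USABLE		0x00000001
#define PM_SYSTEM_STATE_SLEEP		0x00000002
#define PM_SYSTEM_STATE_OFF		0x00000004
#define PM_SYSTEM_STATE_MAX_PERFORMANCE	0x00000100
#define PM_SYSTEM_STATE_MAX_BATTERY	0x00000200

/* Hypothetical states of an imaginary device. */
enum demo_dev_state {
	DEMO_FULL = 0,
	DEMO_LOW_POWER = 1,
	DEMO_OFF = 2
};

/*
 * Guess an appropriate device state from the system state flags alone.
 * The user setting only breaks the tie when the system state expresses
 * no emphasis of its own.
 */
static enum demo_dev_state demo_policy_choose(unsigned int system_flags,
					      int user_prefers_battery)
{
	if (system_flags & (PM_SYSTEM_STATE_SLEEP | PM_SYSTEM_STATE_OFF))
		return DEMO_OFF;
	if (system_flags & PM_SYSTEM_STATE_MAX_PERFORMANCE)
		return DEMO_FULL;
	if (system_flags & PM_SYSTEM_STATE_MAX_BATTERY)
		return DEMO_LOW_POWER;
	return user_prefers_battery ? DEMO_LOW_POWER : DEMO_FULL;
}
```

A policy manager that does recognize the actual state number could bypass this guess entirely and pick a state directly.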
>
> This is where I see the policy managers taking a greater role. In
> addition to the hard coded states (the first five you defined sound fine
> to me), the user would define a number of pseudo system states as well
> as define the translation tables for device states under those pseudo
> states. Each pseudo state would be registered with the PM core and
> assigned a unique identifier.
That's an interesting idea. How would the sysfs interface work? In
general, it's challenging to allow the user to define policy
configuration settings on a per-system-state basis.
>
> We would then define a system state flag, say
> PM_SYSTEM_STATE_PSEUDO, which would prompt a driver to query its policy
> manager to get the translation for the right device state. Those
> devices that don't understand or care about user defined power
> states would simply ignore the flag.
I think the policy manager would directly request for a change of the
state. Then the driver would handle *suspend. Although the policy
manager is code included with the driver, I'd like it to handle any
decisions. So it would be the policy manager's job to notice something
like PM_SYSTEM_STATE_PSEUDO.
>
> I think that way we could keep things simple for the laptop /
> desktop folks while still providing a bit of flexibility on the embedded
> side of things.
>
> Regards,
> Jordan
I appreciate the comments.
Thanks,
Adam
* Re: [RFC] A New Power Management API
2005-04-15 18:54 ` Adam Belay
@ 2005-04-16 2:53 ` Todd Poynor
2005-04-16 19:26 ` David Brownell
2005-04-16 18:24 ` David Brownell
1 sibling, 1 reply; 18+ messages in thread
From: Todd Poynor @ 2005-04-16 2:53 UTC (permalink / raw)
To: Adam Belay; +Cc: linux-pm
Adam Belay wrote:
> On Fri, 2005-04-15 at 09:50 -0600, Jordan Crouse wrote:
...
>>This is where I see the policy managers taking a greater role. In
>>addition to the hard coded states (the first five you defined sound fine
>>to me), the user would define a number of pseudo system states as well
>>as define the translation tables for device states under those pseudo
>>states. Each pseudo state would be registered with the PM core and
>>assigned a unique identifier.
>
>
> That's an interesting idea. How would the sysfs interface work? In
> general, it's challenging to allow the user to define policy
> configuration settings on a per-system-state basis.
>
>
>>We would then define a system state flag, say
>>PM_SYSTEM_STATE_PSEUDO, which would prompt a driver to query its policy
>>manager to get the translation for the right device state. Those
>>devices that don't understand or care about user defined power
>>states would simply ignore the flag.
>
>
> I think the policy manager would directly request for a change of the
> state. Then the driver would handle *suspend. Although the policy
> manager is code included with the driver, I'd like it to handle any
> decisions. So it would be the policy manager's job to notice something
> like PM_SYSTEM_STATE_PSEUDO.
As a real-life embedded systems example, I have periodic conversations
with some folks who design Linux-based mobile phones, and they have
recently asked for the addition of features that I figure can be
addressed by approximately just this.
On the other hand, I've been trying to push them toward a simpler-kernel
model in which all the product-specific logic for placing various
devices in designated power states occurs in a userspace power policy
manager, and: (a) most drivers take no additional action at system
suspend, leaving the policy-manager-set power state in place, with some
exceptions for drivers necessary to run the system right up until sleep
time; (b) most drivers do nothing at system resume other than those
needed to get the system up and running (for a mobile phone, say, at
least mtd subsystem and flash chip driver) and then let userspace policy
manager decide what power state to place all the devices in. Their
products tend to do highly customized things like power back up a small
subset of devices that were on prior to the sleep, or briefly power up
previously-off devices to check to see if any important changes in the
environment occurred during the sleep, etc.
I think either way will work fine for what they're trying to do, am open
to discuss the merits of either approach. If anyone has any comments on
the simplistic kernel method it's appreciated. Thanks,
--
Todd
* Re: [RFC] A New Power Management API
2005-04-15 13:16 ` Daniel Petrini
2005-04-15 18:20 ` Adam Belay
@ 2005-04-16 17:13 ` David Brownell
2005-04-17 20:26 ` Adam Belay
1 sibling, 1 reply; 18+ messages in thread
From: David Brownell @ 2005-04-16 17:13 UTC (permalink / raw)
To: linux-pm, Daniel Petrini; +Cc: Adam Belay
[-- Attachment #1: Type: text/plain, Size: 893 bytes --]
On Friday 15 April 2005 6:16 am, Daniel Petrini wrote:
>
> Suggestion: as discussed in earlier weeks, why not adopt names that take into
> consideration that Linux can run on systems other than desktops and servers?
>
> #define PM_SYSTEM_STATE_SUSPEND_RAM 0x00000010
> #define PM_SYSTEM_STATE_SUSPEND_DISK 0x00000020
The notion of RAM doesn't bother me much, although it may matter
whether that means "all system RAM is preserved" versus "only a
small bit of on-chip SDRAM is preserved".
The notion of DISK bothers me more. Systems with only FLASH/MTD
probably aren't more demanding of PM infrastructure, but when
folk assume that the only non-volatile storage is a disk, it may
start to seem that way.
For the moment, maybe using "NONVOLATILE" rather than DISK would
be good. Also, somewhat of a nit, those aren't exactly states,
so maybe take at least the _STATE_ thingie out.
- Dave
* Re: [RFC] A New Power Management API
2005-04-15 2:46 [RFC] A New Power Management API Adam Belay
` (2 preceding siblings ...)
2005-04-15 15:50 ` Jordan Crouse
@ 2005-04-16 17:27 ` David Brownell
2005-04-17 20:25 ` Adam Belay
3 siblings, 1 reply; 18+ messages in thread
From: David Brownell @ 2005-04-16 17:27 UTC (permalink / raw)
To: linux-pm; +Cc: Adam Belay
[-- Attachment #1: Type: text/plain, Size: 773 bytes --]
On Thursday 14 April 2005 7:46 pm, Adam Belay wrote:
> The new API uses a power container/domain model.
I like that as a basic organizing principle, and I don't think
anyone has problems with the notion that the power relationships
can't always map directly to the physical device tree.
Could you describe a bit about how the containers behave? For
example, when a device drops its power consumption, how does it
notify its container? And when sets of devices -- e.g. all USB
devices, all PCI devices -- support the same power states, how
will that code be shared? Would "struct power_device" be the
driver model replacement for "struct power"?
Also, I'd rather see "struct system_power_state *" everywhere
you're now passing "int system_state" in the API.
- Dave
* Re: [RFC] A New Power Management API
2005-04-15 18:54 ` Adam Belay
2005-04-16 2:53 ` Todd Poynor
@ 2005-04-16 18:24 ` David Brownell
2005-04-17 20:48 ` Adam Belay
1 sibling, 1 reply; 18+ messages in thread
From: David Brownell @ 2005-04-16 18:24 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 3294 bytes --]
On Friday 15 April 2005 11:54 am, Adam Belay wrote:
> On Fri, 2005-04-15 at 09:50 -0600, Jordan Crouse wrote:
> > Adam -
> >
> > Nice stuff. A few comments. First, a question:
> >
> > Is it safe to assume that PMU devices (like these:
> > http://www.dialog-semiconductor.com/template.php?page_id=62) would fall
> > under the category of Power Resources? If so, we may need to consider
> > some more advanced hooks than just on/off.
>
> I'm not familiar with how we would control such a device from software.
> If you think more than _ON and _OFF would be necessary, I could
> certainly revise this code.
Other similar devices include the TI TPS6501x series:
http://focus.ti.com/docs/prod/folders/print/tps65010.html
One of these moments I'll submit the driver I wrote for use with
OMAP; it's pretty minimalist just now, but tps65010.c is in BK at:
http://linux-omap.bkbits.net:8080/main/src/drivers/i2c/chips
Yes, more than "on" and "off" is necessary. Lots of drivers
build on that. Currently most of them just access the GPIOs
it provides, interact with battery recharging, or control
various aspects of system power state transitions. The hardware
defaults manage a "not-fast" battery recharge, so battery based
systems "just work".
That driver doesn't yet do anything like forcing a system suspend
when battery power goes dangerously low, or even feeding IRQs to
other drivers. Much less monitoring battery recharge or managing
fast recharge. That is, it's incomplete ... so all you will see
right now is stuff that's essential even for devel boards that
mostly run on AC power. No UI is supported yet.
Those Dialog devices have a slightly different functional mix.
Motorola also has similar parts, also used in cell phones.
One small point to notice: this is an I2C driver, but it needs
to initialize quite early on most boards, since it's used to power
up other devices (including often the DSP). So the system init
sequence has been adjusted to make that behave: I2C initializes
early (subsys_initcall), as does this specific I2C driver.
> ACPI only supports _ON and _OFF in its
> model. More complex stuff is handled by the state of the device.
>
> ---(power_resource)
> | |
> domain -> device
>
> In my design, power resources are specific power planes and clocks that
> belong to a given power domain (or that domain's parent). They often
> depend on the state of the domain or the system state. A device may
> require multiple power resources, but can only belong to one power
> domain.
But power domains must be able to belong to other power domains,
and the domains are themselves power devices... so there's a fair
amount of "inheritance" going on.
> I could see something like having a clock put at a slower
> speed, so maybe we do need more than on and off.
How would you handle the ARM clock trees? E.g. the stuff you'll
find in arch/arm/mach-omap/clock.c in 2.6.12-rc2? (Older versions
are very similar, the latest version in the linux-omap tree above
can be made to disable unused clocks at boot time.) My initial
thought was that these specific resources wouldn't show up as any
kind of power resource, since they're already managed properly
(by clk_use/clk_unuse) as drivers activate and de-activate.
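For comparison, the use-counting idea behind that kind of interface can be sketched in a few lines of standalone C. This is not the OMAP clock code and not Adam's proposal: the mini_resource type and its functions are hypothetical, and whether pm_enable_resource()/pm_disable_resource() would count users this way is an assumption.

```c
/* Hypothetical, trimmed-down stand-in for a clock or power plane. */
struct mini_resource {
	int use_count;	/* number of active users */
	int hw_on;	/* 1 while the (simulated) hardware is powered */
};

/* Take a reference; the first user switches the hardware on. */
static int mini_resource_enable(struct mini_resource *res)
{
	if (res->use_count++ == 0)
		res->hw_on = 1;	/* real code would call the "on" hook here */
	return 0;
}

/*
 * Drop a reference; the last user switches the hardware off.
 * Returns -1 on an unbalanced disable.
 */
static int mini_resource_disable(struct mini_resource *res)
{
	if (res->use_count == 0)
		return -1;
	if (--res->use_count == 0)
		res->hw_on = 0;	/* real code would call the "off" hook here */
	return 0;
}
```

With this pattern a shared clock stays on as long as any driver is active, which is why such clocks may not need to appear as separate power resources.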
- Dave
* Re: [RFC] A New Power Management API
2005-04-16 2:53 ` Todd Poynor
@ 2005-04-16 19:26 ` David Brownell
2005-04-19 3:09 ` Todd Poynor
0 siblings, 1 reply; 18+ messages in thread
From: David Brownell @ 2005-04-16 19:26 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 4255 bytes --]
On Friday 15 April 2005 7:53 pm, Todd Poynor wrote:
> Adam Belay wrote:
> > On Fri, 2005-04-15 at 09:50 -0600, Jordan Crouse wrote:
> ...
> >>We would then define a system state flag, say
> >>PM_SYSTEM_STATE_PSEUDO, which would prompt a driver to query its policy
> >>manager to get the translation for the right device state. Those
> >>devices that don't understand or care about user defined power
> >>states would simply ignore the flag.
> >
> > I think the policy manager would directly request for a change of the
> > state. Then the driver would handle *suspend. Although the policy
> > manager is code included with the driver, I'd like it to handle any
> > decisions. So it would be the policy manager's job to notice something
> > like PM_SYSTEM_STATE_PSEUDO.
>
> As a real-life embedded systems example, I have periodic conversations
> with some folks who design Linux-based mobile phones, and they have
> recently asked for the addition of features that I figure can be
> addressed by approximately just this.
I like that idea, except that I don't see why it should kick in
only with a PM_SYSTEM_STATE_PSEUDO flag. Why shouldn't the power
container's policy code _always_ work that way? That is, I see
no motivation for a special case here.
> On the other hand, I've been trying to push them toward a simpler-kernel
> model in which all the product-specific logic for placing various
> devices in designated power states occurs in a userspace power policy
> manager,
So maybe you have some answers for me about why there should need
to be *any* notion of exporting device power states. (Rather than
not caring, and just requiring driver-internal power states always
to become consistent with the upcoming system power state.)
If one takes the notion of userspace managing those states off the table,
is there any other motivation to expose those device power states? If
not, the only reason to export power states would be to support this
particular style of userspace power management micro-policy.
I'm pretty sure that supporting this sort of userspace functionality
won't really fit into the "simpler kernel" rubric. If for no other
reason than the self-evident fact that a kernel exporting such stuff
must have more code than one not exporting it...
Right now the only reason I'm not strongly leaning towards just
removing the /sys/.../power/state files completely is that I still
want a way to test individual driver suspend and resume methods
without forcing some system sleep transition. It's a real pain
to be unable to test each driver until all the others are
also working correctly; in fact, it's a major chicken/egg issue.
> and: (a) most drivers take no additional action at system
> suspend, leaving the policy-manager-set power state in place, with some
> exceptions for drivers necessary to run the system right up until sleep
> time; (b) most drivers do nothing at system resume other than those
> needed to get the system up and running (for a mobile phone, say, at
> least mtd subsystem and flash chip driver) and then let userspace policy
> manager decide what power state to place all the devices in. Their
> products tend to do highly customized things like power back up a small
> subset of devices that were on prior to the sleep, or briefly power up
> previously-off devices to check to see if any important changes in the
> environment occurred during the sleep, etc.
>
> I think either way will work fine for what they're trying to do, am open
> to discuss the merits of either approach. If anyone has any comments on
> the simplistic kernel method it's appreciated. Thanks,
I think it's more accurate to talk about that as a "userspace micro-policy
management" approach rather than a "simplistic kernel method". Simpler
kernels would export only broad-brush policy controls.
I see it as akin to the tradeoff between lots of fine grained locks, versus
a few coarse grained ones: except for the atypical highly-contended cases,
one coarse lock invariably provides the lowest overhead.
- Dave
>
> --
> Todd
* Re: [RFC] A New Power Management API
2005-04-16 17:27 ` David Brownell
@ 2005-04-17 20:25 ` Adam Belay
0 siblings, 0 replies; 18+ messages in thread
From: Adam Belay @ 2005-04-17 20:25 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1568 bytes --]
On Sat, 2005-04-16 at 10:27 -0700, David Brownell wrote:
> On Thursday 14 April 2005 7:46 pm, Adam Belay wrote:
>
> > The new API uses a power container/domain model.
>
> I like that as a basic organizing principle, and I don't think
> anyone has problems with the notion that the power relationships
> can't always map directly to the physical device tree.
>
> Could you describe a bit about how the containers behave? For
> example, when a device drops its power consumption, how does it
> notify its container? And when sets of devices -- e.g. all USB
> devices, all PCI devices -- support the same power states, how
> will that code be shared? Would "struct power_device" be the
> driver model replacement for "struct power"?
Here are my current thoughts:
When the greatest common denominator of power requirements is lowered (as a
result of a power_device lowering its state), the policy manager of the
container "struct power_device" will be notified. Code for the same
states will not be shared, because the method of transitioning these
states is different. Also their meanings are often different. However,
if the states are the same, the mappings will likely be simplistic.
Every "struct device" will have a "struct power_device", but not every
"struct power_device" must have a "struct device". "struct
power_device" will be the base unit of power management topologies.
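As a rough illustration of the notification described above, here is a
userspace sketch, not code from the patch. Only the name "struct
power_device" comes from this thread; every field, the pd_* helpers, and
the convention that a larger number means "needs more power" are
assumptions made for the example.

```c
#include <stddef.h>

#define PD_MAX_CHILDREN 8

struct power_device;

/* Hypothetical policy-manager hook, called when the combined
 * requirement of a container's children drops. */
typedef void (*pd_notify_fn)(struct power_device *container, int level);

struct power_device {
	const char *name;
	struct power_device *parent;	/* containing power domain, or NULL */
	struct power_device *children[PD_MAX_CHILDREN];
	size_t nr_children;
	int required;	/* what this device needs from its container */
	int combined;	/* highest requirement among the children */
	pd_notify_fn notify;	/* the container's policy manager */
};

/* Highest requirement currently placed on a container. */
static int pd_combined(const struct power_device *pd)
{
	int max = 0;

	for (size_t i = 0; i < pd->nr_children; i++)
		if (pd->children[i]->required > max)
			max = pd->children[i]->required;
	return max;
}

static int pd_add_child(struct power_device *parent,
			struct power_device *child)
{
	if (parent->nr_children >= PD_MAX_CHILDREN)
		return -1;
	parent->children[parent->nr_children++] = child;
	child->parent = parent;
	parent->combined = pd_combined(parent);
	return 0;
}

/* A device changes what it needs; tell the container's policy
 * manager only if the combined requirement actually went down. */
static void pd_set_required(struct power_device *pd, int required)
{
	struct power_device *parent = pd->parent;
	int old, now;

	pd->required = required;
	if (!parent)
		return;

	old = parent->combined;
	now = pd_combined(parent);
	parent->combined = now;
	if (now < old && parent->notify)
		parent->notify(parent, now);
}

/* Stand-in policy manager that only records what it was told. */
static int pd_notify_count;
static int pd_last_level = -1;

static void pd_log_policy(struct power_device *container, int level)
{
	(void)container;
	pd_notify_count++;
	pd_last_level = level;
}
```

With a bridge containing two devices that need levels 3 and 2, lowering
the second device to 0 leaves the combined requirement at 3, so nothing
is reported; lowering the first to 1 drops it to 1 and the bridge's
policy manager is called once.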
>
> Also, I'd rather see "struct system_power_state *" everywhere
> you're now passing "int system_state" in the API.
>
> - Dave
Fixed. Thanks for the suggestion.
Adam
* Re: [RFC] A New Power Management API
2005-04-16 17:13 ` David Brownell
@ 2005-04-17 20:26 ` Adam Belay
0 siblings, 0 replies; 18+ messages in thread
From: Adam Belay @ 2005-04-17 20:26 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1045 bytes --]
On Sat, 2005-04-16 at 10:13 -0700, David Brownell wrote:
> On Friday 15 April 2005 6:16 am, Daniel Petrini wrote:
> >
> > Suggestion: as discussed in earlier weeks, why not adopt names taking into
> > consideration that Linux can run in systems other than desktops and servers?
> >
> > #define PM_SYSTEM_STATE_SUSPEND_RAM 0x00000010
> > #define PM_SYSTEM_STATE_SUSPEND_DISK 0x00000020
>
> The notion of RAM doesn't bother me much, although it may matter
> whether that means "all system RAM is preserved" versus "only a
> small bit of on-chip SDRAM is preserved".
>
> The notion of DISK bothers me more. Systems with only FLASH/MTD
> probably aren't more demanding of PM infrastructure, but when
> folk assume that the only non-volatile storage is a disk, it may
> start to seem that way.
>
> For the moment, maybe using "NONVOLATILE" rather than DISK would
> be good. Also, somewhat of a nit, those aren't exactly states,
> so maybe take at least the _STATE_ thingie out.
>
> - Dave
Yeah, I agree. I've changed this.
Thanks,
Adam
* Re: [RFC] A New Power Management API
2005-04-16 18:24 ` David Brownell
@ 2005-04-17 20:48 ` Adam Belay
2005-04-17 22:29 ` David Brownell
0 siblings, 1 reply; 18+ messages in thread
From: Adam Belay @ 2005-04-17 20:48 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 4810 bytes --]
On Sat, 2005-04-16 at 11:24 -0700, David Brownell wrote:
> On Friday 15 April 2005 11:54 am, Adam Belay wrote:
> > On Fri, 2005-04-15 at 09:50 -0600, Jordan Crouse wrote:
> > > Adam -
> > >
> > > Nice stuff. A few comments. First, a question:
> > >
> > > Is it safe to assume that PMU devices (like these:
> > > http://www.dialog-semiconductor.com/template.php?page_id=62) would fall
> > > under the category of Power Resources? if so, we may need to consider
> > > some more advanced hooks than just on/off.
> >
> > I'm not familiar with how we would control such a device from software.
> > If you think more than _ON and _OFF would be necessary, I could
> > certainly revise this code.
>
> Other similar devices include the TI TPS6501x series:
>
> http://focus.ti.com/docs/prod/folders/print/tps65010.html
>
> One of these moments I'll submit the driver I wrote for use with
> OMAP; it's pretty minimalist just now, but tps65010.c is in BK at:
>
> http://linux-omap.bkbits.net:8080/main/src/drivers/i2c/chips
>
> Yes, more than "on" and "off" is necessary. Lots of drivers
> build on that. Currently most of them just access the GPIOs
> it provides, interact with battery recharging, or control
> various aspects of system power state transitions. The hardware
> defaults manage a "not-fast" battery recharge, so battery based
> systems "just work".
Ok, so I think each power plane (or LDO or whatever) would be a power
resource, not the controller itself.
>
> That driver doesn't yet do anything like forcing a system suspend
> when battery power goes dangerously low, or even feeding IRQs to
> other drivers. Much less monitoring battery recharge or managing
> fast recharge. That is, it's incomplete ... so all you will see
> right now is stuff that's essential even for devel boards that
> mostly run on AC power. No UI is supported yet.
I've looked through the code. I appreciate the information.
>
> Those Dialog devices have a slightly different functional mix.
> Motorola also has similar parts, also used in cell phones.
>
> One small point to notice: this is an I2C driver, but it needs
> to initialize quite early on most boards, since it's used to power
> up other devices (including often the DSP). So the system init
> sequence has been adjusted to make that behave: I2C initializes
> early (subsys_initcall), as does this specific I2C driver.
Yeah, sounds like that could get rather tricky.
>
>
> > ACPI only supports _ON and _OFF in its
> > model. More complex stuff is handled by the state of the device.
> >
> > ---(power_resource)
> > | |
> > domain -> device
> >
> > In my design, power resources are specific power planes and clocks that
> > belong to a given power domain (or that domain's parent). They often
> > depend on the state of the domain or the system state. A device may
> > require multiple power resources, but can only belong to one power
> > domain.
>
> But power domains must be able to belong to other power domains,
> and the domains are themselves power devices... so there's a fair
> amount of "inheritance" going on.
Correct, "power devices" use an inheritance model. "power resources" can
be independent of the inheritance and belong to the global system pool.
I think this can cover most platforms.
>
>
> > I could see something like having a clock put at a slower
> > speed, so maybe we do need more than on and off.
>
> How would you handle the ARM clock trees? E.g. the stuff you'll
> find in arch/arm/mach-omap/clock.c in 2.6.12-rc2? (Older versions
> are very similar, the latest version in the linux-omap tree above
> can be made to disable unused clocks at boot time.) My initial
> thought was that these specific resources wouldn't show up as any
> kind of power resource, since they're already managed properly
> (by clk_use/clk_unuse) as drivers activate and de-activate.
So I think each clock could be a power resource. It looks like the
current code isn't doing much more than turning them on and off. I'm
curious about changing the frequency. In what cases might this happen?
Would it be related to power state of the device, or could the frequency
be changed without changing the state? Is the frequency a specific
requirement of the device that isn't intended to be changed at all?
I think adding more than on and off might get to be too complicated, but
clearly every power resource we're interested in has an on/off
capability. If we really need this sort of functionality to be handled
by the power management subsystem, then maybe we could have a "struct
power_clock" with a "struct power_resource" inside of it? Maybe the
drivers should be handling these sort of things on their own? I'm
interested in how drivers will be interacting with this.
Thanks,
Adam
* Re: [RFC] A New Power Management API
2005-04-17 20:48 ` Adam Belay
@ 2005-04-17 22:29 ` David Brownell
2005-04-17 23:01 ` Adam Belay
0 siblings, 1 reply; 18+ messages in thread
From: David Brownell @ 2005-04-17 22:29 UTC (permalink / raw)
To: Adam Belay; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 6478 bytes --]
On Sunday 17 April 2005 1:48 pm, Adam Belay wrote:
> On Sat, 2005-04-16 at 11:24 -0700, David Brownell wrote:
> > On Friday 15 April 2005 11:54 am, Adam Belay wrote:
> > >
> > > I'm not familiar with how we would control such a device from software.
> > > If you think more than _ON and _OFF would be necessary, I could
> > > certainly revise this code.
> >
> > Other similar devices include the TI TPS6501x series:
> > ...
> > Yes, more than "on" and "off" is necessary. ...
>
> Ok, so I think each power plane (or LDO or whatever) would be a power
> resource, not the controller itself.
Maybe. I suspect packaging it that way would seem like "extra"
work, but I won't pretend to know all the right answers for how
to talk to such chips. One person asking me how to handle a
similar chip on a Motorola platform told me he expected to get
more than a few arrows in his back ... and I was kind of worried
about starting a new drivers/i2c/power subtree! ;)
What I've seen so far is that the LDO setup is just board-specific
init logic of the "write once and forget" variety. Which means
that software basically ignores it later ... although it's possible
that some LDOs might be adjusted to lower voltages during certain
system states.
Likewise with GPIOs used to switch power supplied through the LDOs:
board-specific logic kicking in when the various drivers need to do
their thing.
> > One small point to notice: this is an I2C driver, but it needs
> > to initialize quite early on most boards, since it's used to power
> > up other devices (including often the DSP). So the system init
> > sequence has been adjusted to make that behave: I2C initializes
> > early (subsys_initcall), as does this specific I2C driver.
>
> Yeah, sounds like that could get rather tricky.
Not really, but I thought it was worth pointing out. It's an example
of a case where PC-ish assumptions can cause surprises. (Another
would be assuming PM registers can be accessed in IRQ context... or
that ACPI handles any of that!)
> > > I could see something like having a clock put at a slower
> > > speed, so maybe we do need more than on and off.
> >
> > How would you handle the ARM clock trees? E.g. the stuff you'll
> > find in arch/arm/mach-omap/clock.c in 2.6.12-rc2? (Older versions
> > are very similar, the latest version in the linux-omap tree above
> > can be made to disable unused clocks at boot time.) My initial
> > thought was that these specific resources wouldn't show up as any
> > kind of power resource, since they're already managed properly
> > (by clk_use/clk_unuse) as drivers activate and de-activate.
>
> So I think each clock could be a power resource. It looks like the
> current code isn't doing much more than turning them on and off.
They should clk_use()/clk_unuse(), and automatically handle the
activation/deactivation of parent clocks. Some clocks are shared
between multiple devices ... so "on/off" doesn't suffice, those
devices' drivers rely on the clock framework to coordinate. (Or
were you thinking that on/off would match use/unuse? There's no
point to separate enable/disable calls IMO.)
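To make the use/unuse coordination concrete, here is a simplified
userspace model. It is not the OMAP clock code: the model_clk names and
fields are invented for illustration, and the "enabled" flag merely
stands in for a hardware gate bit.

```c
#include <stddef.h>

struct model_clk {
	const char *name;
	struct model_clk *parent;	/* NULL for a root clock */
	unsigned int usecount;	/* active users, child clocks included */
	int enabled;	/* stands in for the hardware gate bit */
};

/* Take a reference. The first user switches the clock on, after
 * first taking a reference on the parent (and so on up the tree). */
static void model_clk_use(struct model_clk *clk)
{
	if (clk->usecount++ == 0) {
		if (clk->parent)
			model_clk_use(clk->parent);
		clk->enabled = 1;
	}
}

/* Drop a reference. The last user switches the clock off and
 * releases the reference it held on the parent. */
static void model_clk_unuse(struct model_clk *clk)
{
	if (clk->usecount == 0)
		return;	/* unbalanced call; ignored in this sketch */
	if (--clk->usecount == 0) {
		clk->enabled = 0;
		if (clk->parent)
			model_clk_unuse(clk->parent);
	}
}
```

Two leaf clocks sharing one parent show why plain on/off is not enough:
the parent has to stay enabled until the second leaf is released, and
only the use count knows that.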
If there were strong advantages to fitting those into a new resource
framework, that could surely be done. I think any driver actually
touching a clock will be platform-specific, but there's not much
point to having different clock APIs on every platform. (Other
than experimenting, which could be useful for a while yet.)
> I'm
> curious about changing the frequency. In what cases might this happen?
One example I noticed recently: the driver for an audio codec needed
to change the frequency driving that chip. That's for a clock that goes
off-chip. Likewise setting clocks used to drive LCD controllers at a
specific speed; and in another case, setting the clock to match what
was used on an internal bus. Again this mostly seems like init-and-forget.
There's also cpufreq style stuff. If someone implements that for OMAP
it'll need to account for a variety of constraints. For now it seems
that just idling the CPU -- or putting it into big sleep -- is enough
of a win for power saving that cpufreq may not be much wanted.
> Would it be related to power state of the device, or could the frequency
> be changed without changing the state?
Frequency affects power consumption in device-specific ways, but so
far the focus in Linux is just to affect it by ensuring that clocks
are off unless they're needed. That includes enabling hardware clock
gating support. (OMAP automates clock gating a lot more than other chips
I've had occasion to look at.) I suspect _most_ boards expect a single
frequency for each device (plus zero-Hertz).
> Is the frequency a specific
> requirement of the device that isn't intended to be changed at all?
That's specific to the device and its driver. There are lots of drivers
that don't much care about details of frequencies ... only that clocks
fed to their various busses and bridges are all in "safe" ratios. And
there are sometimes also external constraints, e.g. with that codec or
with USB.
> I think adding more than on and off might get to be too complicated, but
> clearly every power resource we're interested in has an on/off
> capability. If we really need this sort of functionality to be handled
> by the power management subsystem, then maybe we could have a "struct
> power_clock" with a "struct power_resource" inside of it? Maybe the
> drivers should be handling these sort of things on their own? I'm
> interested in how drivers will be interacting with this.
Me too. I think ARM has the best-worked Linux clock infrastructure
just now, and the model there is that drivers manage their clocks
explicitly and without any formal platform_device.resource entries.
(Should power resources be treated like any other resources?)
That's one reason I pointed you at OMAP's clock code. It's working
reasonably well right now, and it's a case where simple APIs to clock
trees won't work well ... e.g. PXA, or AT91, chips have really simple
clock trees. More like mach-s3c2410/clock.c in 2.6.current.
OMAP's is complex (deep hierarchy, autogating, very flexible) and is
also chip-specific at the detail level. For example, the USB host
clock is set up differently on 1510/590 vs later chips; and those
later ones have two different MMC controllers, independently clocked.
So it's a rich enough real-world-Linux example to chew on a while. :)
- Dave
* Re: [RFC] A New Power Management API
2005-04-17 22:29 ` David Brownell
@ 2005-04-17 23:01 ` Adam Belay
0 siblings, 0 replies; 18+ messages in thread
From: Adam Belay @ 2005-04-17 23:01 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 897 bytes --]
On Sun, Apr 17, 2005 at 03:29:17PM -0700, David Brownell wrote:
> On Sunday 17 April 2005 1:48 pm, Adam Belay wrote:
> >
> > So I think each clock could be a power resource. It looks like the
> > current code isn't doing much more than turning them on and off.
>
> They should clk_use()/clk_unuse(), and automatically handle the
> activation/deactivation of parent clocks. Some clocks are shared
> between multiple devices ... so "on/off" doesn't suffice, those
> devices' drivers rely on the clock framework to coordinate. (Or
> were you thinking that on/off would match use/unuse? There's no
> point to separate enable/disable calls IMO.)
Yes, I was thinking power resources could replace use/unuse. So each device
could associate a list of required resources by state. Then, when there
are not any devices in a state that requires a given resource, the resource
could be turned off.
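A minimal sketch of that idea, written as plain userspace C rather than
against the proposed API. "struct power_resource" is the name used in
this thread; struct pm_device, the reqs[] table, and the helper names
are hypothetical.

```c
#include <stddef.h>

#define MAX_STATES	4
#define MAX_REQS	4

struct power_resource {
	const char *name;
	unsigned int users;	/* devices in a state that needs this */
	int on;
};

struct pm_device {
	const char *name;
	int state;	/* index into reqs[], or -1 before the first transition */
	/* reqs[s] is a NULL-terminated list of resources needed in state s */
	struct power_resource *reqs[MAX_STATES][MAX_REQS + 1];
};

static void resource_get(struct power_resource *res)
{
	if (res->users++ == 0)
		res->on = 1;	/* first user turns it on */
}

static void resource_put(struct power_resource *res)
{
	if (res->users > 0 && --res->users == 0)
		res->on = 0;	/* last user turns it off */
}

static int pm_device_set_state(struct pm_device *dev, int new_state)
{
	struct power_resource **r;

	if (new_state < 0 || new_state >= MAX_STATES)
		return -1;

	/* Acquire the new state's resources before releasing the old
	 * ones, so a resource needed in both never drops to zero. */
	for (r = dev->reqs[new_state]; *r; r++)
		resource_get(*r);
	if (dev->state >= 0)
		for (r = dev->reqs[dev->state]; *r; r++)
			resource_put(*r);

	dev->state = new_state;
	return 0;
}
```

A state whose list is empty requires nothing, so moving every device
that references a resource into such a state is what finally switches
the resource off.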
* Re: [RFC] A New Power Management API
2005-04-16 19:26 ` David Brownell
@ 2005-04-19 3:09 ` Todd Poynor
2005-05-08 19:05 ` David Brownell
0 siblings, 1 reply; 18+ messages in thread
From: Todd Poynor @ 2005-04-19 3:09 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm
David Brownell wrote:
> On Friday 15 April 2005 7:53 pm, Todd Poynor wrote:
>...
>>On the other hand, I've been trying to push them toward a simpler-kernel
>>model in which all the product-specific logic for placing various
>>devices in designated power states occurs in a userspace power policy
>>manager,
>
> So maybe you have some answers for me about why there should need
> to be *any* notion of exporting device power states. (Rather than
> not caring, and just requiring driver-internal power states always
> to become consistent with the upcoming system power state.)
>
> If one takes the notion of userspace managing those states off the table,
> is there any other motivation to expose those device power states?
So far as I have been able to determine, the usage you describe, as well
as straightforward individual device power state management (power this
device off upon receipt of some event, period of inactivity, etc.) is
common, although with enough flexibility in creating custom system power
states and reacting to those states in driver power controller objects,
the second usage may be covered by the first. I believe in some cases
power policy management applications are also examining device power
state to make policy decisions, but I certainly recommend use of
kernel-to-userspace notification of power events over such polling. If
I recall any other more exotic things I've run across I'll post a
response later.
> I'm pretty sure that supporting this sort of userspace functionality
> won't really fit into the "simpler kernel" rubric. If for no other
> reason than the self-evident fact that a kernel exporting such stuff
> must have more code than one not exporting it...
It may not be simpler overall on the kernel side (at least prior to
being customized for a particular policy, for which the in-kernel
alternative does add some amount of code). It does, however, place the
code implementing custom system power states and associated decisions on
device power states (which might fall under the general category of
"policy") in userspace and exports just enough kernel interfaces for
controlling system and device power behavior to let userspace handle it.
As I say, though, the userspace vs. kernel home for this isn't a
matter of strong preference to me, but I have suggested userspace device
state management as arguably more in keeping with prevailing winds of
kernel development.
I don't mean to get hung up in the terminology, but I wouldn't call
relatively infrequent changes to fairly large-grained device power
states "micro-policy management" either -- there is plenty of room for
more frequent, possibly finer grained, changes to device power
characteristics in driver code based on device state or hardware
capabilities (auto clock gating, power up/down according to app use...),
the aggressiveness of which may also be a policy consideration to be
configured, perhaps from userspace (at least MacOS seems to have such
interfaces).
--
Todd
* Re: [RFC] A New Power Management API
2005-04-19 3:09 ` Todd Poynor
@ 2005-05-08 19:05 ` David Brownell
0 siblings, 0 replies; 18+ messages in thread
From: David Brownell @ 2005-05-08 19:05 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 5415 bytes --]
On Monday 18 April 2005 8:09 pm, Todd Poynor wrote:
> David Brownell wrote:
> > On Friday 15 April 2005 7:53 pm, Todd Poynor wrote:
> >...
> >>On the other hand, I've been trying to push them toward a simpler-kernel
> >>model in which all the product-specific logic for placing various
> >>devices in designated power states occurs in a userspace power policy
> >>manager,
Which, as I've commented, doesn't necessarily seem simple to me!
Userspace policy code could just as well be ALL device-specific,
like "turn off backlight", without needing any generic notion of
power states. Or, something like MontaVista DPM could work with
no need to expose device-specific power states... just the ability
to set some "operating point", which in ACPI terms might be one
of the many possible S0 states.
> > So maybe you have some answers for me about why there should need
> > to be *any* notion of exporting device power states. (Rather than
> > not caring, and just requiring driver-internal power states always
> > to become consistent with the upcoming system power state.)
> >
> > If one takes the notion of userspace managing those states off the table,
> > is there any other motivation to expose those device power states?
>
> So far as I have been able to determine, the usage you describe, as well
> as straightforward individual device power state management (power this
> device off upon receipt of some event, period of inactivity, etc.) is
> common, although with enough flexibility in creating custom system power
> states and reacting to those states in driver power controller objects,
> the second usage may be covered by the first.
I can't quite parse this as an answer to my question though. Driver-internal
states are by definition not exported, and can't be "covered" by userspace.
The userspace knobs I think everyone pretty much agrees on are the sort
that are already widely used in X11, hdparm, and so on: idle timeouts,
with the system probably setting a sane default. Ones that are less
widely agreed on include having "system power states" that are more
specific/custom than just ACPI S1/S3/S4 or analogues ... because they
could include things like S0 states with backlight on vs off, or so on
for other devices. (That does beg the question of which states need to
be known as such to the kernel, and how... via /sys/power/state etc.)
The question was about whether "device power states" have any role
OTHER than to be managed by userspace. You didn't identify any such
role, so I'll continue to believe there is none.
> > I'm pretty sure that supporting this sort of userspace functionality
> > won't really fit into the "simpler kernel" rubric. If for no other
> > reason than the self-evident fact that a kernel exporting such stuff
> > must have more code than one not exporting it...
>
> It may not be simpler overall on the kernel side (at least prior to
> being customized for a particular policy, for which the in-kernel
> alternative does add some amount of code).
Depends on what you expect out of a policy. Separately, I've
talked about how drivers can often be smart enough to stay in low
power modes "all the time" when they're idle. Such drivers don't
need ANY other policy inputs. If more drivers do that, the system
as a whole just needs to be smart enough to tell (possibly from
userspace) when the devices are idle ... and maybe to conclude, as
with the dynamic tick or VST stuff, that it's time to enter some
more power-efficient system state.
To concoct a DPM-like example, if one were to write the name describing
some operating point to some file in sysfs, and that ended up calling a
board-specific routine to suspend/resume various devices, maybe updating
clocks and power supplies, that would be straightforward and obvious.
Which makes it simpler than threading all that stuff throughout the
driver model to support doing that from userspace. :)
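Roughly what I have in mind, as a userspace sketch only: the operating
point names, the board_* variables, and the numbers are placeholders,
and operating_point_store() just imitates what a sysfs store handler
would do with the buffer it is handed.

```c
#include <stddef.h>
#include <string.h>

struct operating_point {
	const char *name;
	int (*enter)(void);	/* board-specific routine */
};

/* Stand-ins for real board state; the values are placeholders. */
static int board_backlight_on = 1;
static int board_cpu_mhz = 192;

static int board_enter_full(void)
{
	board_backlight_on = 1;
	board_cpu_mhz = 192;
	return 0;
}

static int board_enter_idle(void)
{
	board_backlight_on = 0;
	board_cpu_mhz = 96;
	return 0;
}

static const struct operating_point board_points[] = {
	{ "full", board_enter_full },
	{ "idle", board_enter_idle },
};

/* Look up the written name and run the matching board routine. */
static int operating_point_store(const char *buf, size_t len)
{
	size_t i;

	/* "echo" appends a newline; ignore it */
	if (len > 0 && buf[len - 1] == '\n')
		len--;

	for (i = 0; i < sizeof(board_points) / sizeof(board_points[0]); i++) {
		const char *name = board_points[i].name;

		if (strlen(name) == len && memcmp(name, buf, len) == 0)
			return board_points[i].enter();
	}
	return -1;	/* unknown name; a kernel handler would use -EINVAL */
}
```

Everything device-specific stays inside the board routines; userspace
only ever sees the list of names.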
> It does, however, place the
> code implementing custom system power states and associated decisions on
> device power states (which might fall under the general category of
> "policy") in userspace and exports just enough kernel interfaces for
> controlling system and device power behavior to let userspace handle it.
Again, you're not answering the question of what point there'd be to
expose any notion of driver state _other_ than to export that sort
of control knob to user-mode agents ... along with more widely used
things like idle timeout settings and backlight on/off controls,
or less widely accepted DPM-ish ones that effectively provide new
system power states ("operating points").
> As I say, though, the userspace vs. kernel home for this isn't a
> matter of strong preference to me, but I have suggested userspace device
> state management as arguably more in keeping with prevailing winds of
> kernel development.
There are lots of winds blowing, and I don't think you can really say
that there's a wind blowing in favor of a particular type of feature
which has never yet been really usable... if anything, the wind has
always blown in favor of removing broken or unusable capabilities.
It's a question of what sort of policy choice the kernel has any
real business exporting. I've seen lots of interfaces designed to
export low level control knobs that really shouldn't be exported.
This still looks to me like yet another one...
- Dave
Thread overview: 18+ messages
2005-04-15 2:46 [RFC] A New Power Management API Adam Belay
2005-04-15 8:21 ` Benjamin Herrenschmidt
2005-04-15 13:16 ` Daniel Petrini
2005-04-15 18:20 ` Adam Belay
2005-04-16 17:13 ` David Brownell
2005-04-17 20:26 ` Adam Belay
2005-04-15 15:50 ` Jordan Crouse
2005-04-15 18:54 ` Adam Belay
2005-04-16 2:53 ` Todd Poynor
2005-04-16 19:26 ` David Brownell
2005-04-19 3:09 ` Todd Poynor
2005-05-08 19:05 ` David Brownell
2005-04-16 18:24 ` David Brownell
2005-04-17 20:48 ` Adam Belay
2005-04-17 22:29 ` David Brownell
2005-04-17 23:01 ` Adam Belay
2005-04-16 17:27 ` David Brownell
2005-04-17 20:25 ` Adam Belay