* Runtime device power management in userspace
@ 2005-12-23 14:30 Holger Macht
2005-12-23 15:12 ` Patrick Mochel
2005-12-23 15:17 ` Alan Stern
0 siblings, 2 replies; 21+ messages in thread
From: Holger Macht @ 2005-12-23 14:30 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1479 bytes --]
Hi,
We implemented device runtime power management in a userspace application
(the powersave daemon). In this specific case, that means successively
putting PCI devices into the D3 power-saving state by writing a numerical
'3' to the corresponding power state file.
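A minimal userspace sketch of such a write, not taken from the powersave
sources. The helper name write_power_state() is hypothetical, and the
caller is assumed to pass the device's state file (presumably something
like power/state under the device's sysfs directory) together with the
value to write:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a power-state string (e.g. "3" for D3, "0" for on) to a
 * per-device state file. The path comes from the caller; nothing here
 * assumes a particular sysfs layout.
 * Returns 0 on success or a negative errno value.
 */
static int write_power_state(const char *state_path, const char *value)
{
	FILE *f = fopen(state_path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fputs(value, f) == EOF)
		err = -EIO;
	/* A rejected write may only be reported when the stream is flushed. */
	if (fclose(f) == EOF && !err)
		err = -errno;
	return err;
}
```

Checking the return value of the close matters here, because an error
from the kernel side may only become visible once the buffered data is
actually written out.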
There are two main reasons why we are doing this:
1. First, the obvious one. As mentioned in our research on power
consumption posted to this list, there is a large potential to save
battery power.
2. Because this is AFAIK a heavily untested area, we would like, as a
side effect, to get reports about broken modules/drivers and maybe get
them fixed.
It is explicitly meant as an expert option and will be handled in a very
special way in our frontends. The normal user will be warned loudly enough
before he can enable this "feature" ;-) We are also well aware that
all of this should be done in kernel space one day.
Before starting the implementation, we tested on different machines with
different hardware and ran into surprisingly few problems. The only module
we currently know to be broken is e100. Sure, there might be more...
Now that we are hearing more and more voices saying that this could be very
dangerous and problematic because of the current kernel implementation, we
decided to gather some more comments from the people involved and from
experts in this area. That's the aim of my post to this list. Opinions are
welcome!
Thanks in advance,
Holger
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Runtime device power management in userspace
2005-12-23 14:30 Runtime device power management in userspace Holger Macht
@ 2005-12-23 15:12 ` Patrick Mochel
2005-12-24 0:40 ` Pavel Machek
2005-12-24 15:31 ` Holger Macht
2005-12-23 15:17 ` Alan Stern
1 sibling, 2 replies; 21+ messages in thread
From: Patrick Mochel @ 2005-12-23 15:12 UTC (permalink / raw)
To: Holger Macht; +Cc: linux-pm
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1643 bytes --]
On Fri, 23 Dec 2005, Holger Macht wrote:
> Hi,
>
> We implemented device runtime power management in a userspace application
> (the powersave daemon). In this specific case, it means to successively
> put pci devices into D3 powersave mode with writing a numerical '3' to the
> corresponding power state file.
>
> There are two main reasons for us to even doing this:
>
> 1. At first, the obvious reason. As mentioned in our research regarding
> power consumption on this list, there is a very huge potential to
> save battery power.
>
> 2. Due to the fact that this is AFAIK a heavily untested area, as a side
> effect, we like to get reports about broken modules/drivers and maybe
> get them fixed.
That's great!
Please note that D3 is only relevant for PCI devices and for ACPI devices.
The fact that it's the same value for every device in the system is a
design flaw. Please be aware that the value to write to the device file
could change, and will be dependent on the type (bus) of device, and quite
possibly on the device itself. It may not even be '3' for all PCI devices
in the future, or may be a string rather than 1 character, or simply a '1'
into a different file.
Please also note that D3 is not always a good choice. A driver may not be
able to reinitialize the device from D3. And, since it takes longer to
resume from D3, you may want to start with D1 or D2. (The same concept is
true for devices other than PCI, though the values will be different.)
How do you determine the idleness of a device? Or, is it based purely on
user direction?
Also, is there source available?
Thanks,
Patrick
* Re: Runtime device power management in userspace
2005-12-23 14:30 Runtime device power management in userspace Holger Macht
2005-12-23 15:12 ` Patrick Mochel
@ 2005-12-23 15:17 ` Alan Stern
2005-12-24 0:41 ` Pavel Machek
2005-12-24 0:43 ` Pavel Machek
1 sibling, 2 replies; 21+ messages in thread
From: Alan Stern @ 2005-12-23 15:17 UTC (permalink / raw)
To: Holger Macht; +Cc: linux-pm
[-- Attachment #1: Type: TEXT/PLAIN, Size: 727 bytes --]
On Fri, 23 Dec 2005, Holger Macht wrote:
> Hi,
>
> We implemented device runtime power management in a userspace application
> (the powersave daemon). In this specific case, it means to successively
> put pci devices into D3 powersave mode with writing a numerical '3' to the
> corresponding power state file.
> Now that we get more and more votes that this could be very dangerous and
> problematic because of the current kernel implementation, we decided to
> gather some more comments from involved people and experts in this
> area. That's the aim of my post to this list. Opinions are welcome!
Remember that you're likely to run into trouble if you try to suspend a
device that has unsuspended children.
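The constraint can be sketched in userspace terms as follows. struct
dev_node and both helpers are hypothetical illustrations, not kernel
structures or driver-core functions; the point is only the ordering
(children first, parent last):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a device node; not a kernel structure. */
struct dev_node {
	const char *name;
	bool suspended;
	struct dev_node **children;
	size_t nr_children;
};

/* A device is safe to suspend only if every child is already suspended. */
static bool can_suspend(const struct dev_node *dev)
{
	for (size_t i = 0; i < dev->nr_children; i++)
		if (!dev->children[i]->suspended)
			return false;
	return true;
}

/* Suspend the children first, then the device itself (post-order walk). */
static void suspend_subtree(struct dev_node *dev)
{
	for (size_t i = 0; i < dev->nr_children; i++)
		suspend_subtree(dev->children[i]);
	dev->suspended = true;
}
```

A daemon that only ever touches leaf devices trivially satisfies
can_suspend(), which is the simplest way to stay out of trouble.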
Alan Stern
* Re: Runtime device power management in userspace
2005-12-23 15:12 ` Patrick Mochel
@ 2005-12-24 0:40 ` Pavel Machek
2005-12-26 20:43 ` Patrick Mochel
2005-12-24 15:31 ` Holger Macht
1 sibling, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2005-12-24 0:40 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm
On Fri 23-12-05 07:12:35, Patrick Mochel wrote:
> On Fri, 23 Dec 2005, Holger Macht wrote:
> > We implemented device runtime power management in a userspace application
> > (the powersave daemon). In this specific case, it means to successively
> > put pci devices into D3 powersave mode with writing a numerical '3' to the
> > corresponding power state file.
> >
> > There are two main reasons for us to even doing this:
> >
> > 1. At first, the obvious reason. As mentioned in our research regarding
> > power consumption on this list, there is a very huge potential to
> > save battery power.
> >
> > 2. Due to the fact that this is AFAIK a heavily untested area, as a side
> > effect, we like to get reports about broken modules/drivers and maybe
> > get them fixed.
>
> That's great!
>
> Please note that D3 is only relevant for PCI devices and for ACPI devices.
> The fact that it's the same value for every device in the system is a
> design flaw. Please be aware that the value to write to the device file
> could change, and will be dependent on the type (bus) of device, and quite
> possibly on the device itself. It may not even be '3' for all PCI devices
> in the future, or may be a string rather than 1 character, or simply a '1'
> into a different file.
Is there enough locking in driver core to make this safe?
Ouch, and BTW the *right* value is _2_ today... because state_store() uses
the value from the user as pm_message_t.event. We should at least supply
some hint that it is a runtime suspend in pm_message_t.flags --
unfortunately we do not have pm_message_t.flags yet.
Something like this is needed -- userspace should not play with
pm_message_t.event directly. [not even compile tested, have to sleep.]
I guess we should also audit the drivers... It needs locking against
the user supplying requests concurrently.
> Also, is there source available?
Should be, we are trying to do *Open*SUSE these days ;-))).
Pavel
diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
--- a/drivers/base/power/sysfs.c
+++ b/drivers/base/power/sysfs.c
@@ -34,12 +34,14 @@ static ssize_t state_store(struct device
{
pm_message_t state;
char * rest;
- int error = 0;
+ int error = 0, i;
- state.event = simple_strtoul(buf, &rest, 10);
+ i = simple_strtoul(buf, &rest, 10);
+ state.event = PM_EVENT_SUSPEND;
+ state.flags = PM_FLAGS_RUNTIME;
if (*rest)
return -EINVAL;
- if (state.event)
+ if (i)
error = dpm_runtime_suspend(dev, state);
else
dpm_runtime_resume(dev);
diff --git a/include/linux/pm.h b/include/linux/pm.h
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -140,6 +140,7 @@ struct device;
typedef struct pm_message {
int event;
+ int flags;
} pm_message_t;
/*
@@ -165,6 +166,8 @@ typedef struct pm_message {
#define PM_EVENT_FREEZE 1
#define PM_EVENT_SUSPEND 2
+#define PM_FLAGS_RUNTIME 1
+
#define PMSG_FREEZE ((struct pm_message){ .event = PM_EVENT_FREEZE, })
#define PMSG_SUSPEND ((struct pm_message){ .event = PM_EVENT_SUSPEND, })
#define PMSG_ON ((struct pm_message){ .event = PM_EVENT_ON, })
--
Thanks, Sharp!
* Re: Runtime device power management in userspace
2005-12-23 15:17 ` Alan Stern
@ 2005-12-24 0:41 ` Pavel Machek
2005-12-24 0:43 ` Pavel Machek
1 sibling, 0 replies; 21+ messages in thread
From: Pavel Machek @ 2005-12-24 0:41 UTC (permalink / raw)
To: Alan Stern; +Cc: linux-pm
On Fri 23-12-05 10:17:45, Alan Stern wrote:
> On Fri, 23 Dec 2005, Holger Macht wrote:
>
> > Hi,
> >
> > We implemented device runtime power management in a userspace application
> > (the powersave daemon). In this specific case, it means to successively
> > put pci devices into D3 powersave mode with writing a numerical '3' to the
> > corresponding power state file.
>
> > Now that we get more and more votes that this could be very dangerous and
> > problematic because of the current kernel implementation, we decided to
> > gather some more comments from involved people and experts in this
> > area. That's the aim of my post to this list. Opinions are welcome!
>
> Remember that you're likely to run into trouble if you try to suspend a
> device that has unsuspended children.
I guess we only want to suspend leaf PCI devices for now. That's
where the lowest-hanging fruit is.
Pavel
--
Thanks, Sharp!
* Re: Runtime device power management in userspace
2005-12-23 15:17 ` Alan Stern
2005-12-24 0:41 ` Pavel Machek
@ 2005-12-24 0:43 ` Pavel Machek
1 sibling, 0 replies; 21+ messages in thread
From: Pavel Machek @ 2005-12-24 0:43 UTC (permalink / raw)
To: Alan Stern; +Cc: linux-pm
On Fri 23-12-05 10:17:45, Alan Stern wrote:
> On Fri, 23 Dec 2005, Holger Macht wrote:
>
> > Hi,
> >
> > We implemented device runtime power management in a userspace application
> > (the powersave daemon). In this specific case, it means to successively
> > put pci devices into D3 powersave mode with writing a numerical '3' to the
> > corresponding power state file.
>
> > Now that we get more and more votes that this could be very dangerous and
> > problematic because of the current kernel implementation, we decided to
> > gather some more comments from involved people and experts in this
> > area. That's the aim of my post to this list. Opinions are welcome!
>
> Remember that you're likely to run into trouble if you try to suspend a
> device that has unsuspended children.
...network drivers should be easy targets -- we can copy suspend
handling from _close() routine. OTOH for example b44.c does
have quite different code between _close() and _suspend()...
Pavel
--
Thanks, Sharp!
* Re: Runtime device power management in userspace
2005-12-23 15:12 ` Patrick Mochel
2005-12-24 0:40 ` Pavel Machek
@ 2005-12-24 15:31 ` Holger Macht
2005-12-26 20:58 ` Patrick Mochel
1 sibling, 1 reply; 21+ messages in thread
From: Holger Macht @ 2005-12-24 15:31 UTC (permalink / raw)
To: linux-pm
[-- Attachment #1: Type: text/plain, Size: 3535 bytes --]
On Fri 23 Dec - 07:12:35, Patrick Mochel wrote:
>
> On Fri, 23 Dec 2005, Holger Macht wrote:
>
> > Hi,
> >
> > We implemented device runtime power management in a userspace application
> > (the powersave daemon). In this specific case, it means to successively
> > put pci devices into D3 powersave mode with writing a numerical '3' to the
> > corresponding power state file.
> >
> > There are two main reasons for us to even doing this:
> >
> > 1. At first, the obvious reason. As mentioned in our research regarding
> > power consumption on this list, there is a very huge potential to
> > save battery power.
> >
> > 2. Due to the fact that this is AFAIK a heavily untested area, as a side
> > effect, we like to get reports about broken modules/drivers and maybe
> > get them fixed.
>
> That's great!
>
> Please note that D3 is only relevant for PCI devices and for ACPI devices.
> The fact that it's the same value for every device in the system is a
> design flaw. Please be aware that the value to write to the device file
> could change, and will be dependent on the type (bus) of device, and quite
> possibly on the device itself. It may not even be '3' for all PCI devices
> in the future, or may be a string rather than 1 character, or simply a '1'
> into a different file.
It should be no problem to adjust that if the kernel interface changes.
> Please also note that D3 is not always a good choice. A driver may not be
> able to reinitialize the device from D3. And, since it takes longer to
> resume from D3, you may want to start with D1 or D2. (The same concept is
> true for devices other than PCI, though the values will be different.)
We have only played around with D3 so far. The main implementation is done,
but we are still at a very early stage, so there is much room for
improvement...
> How do you determine the idleness of a device? Or, is it based purely on
> user direction?
That's one of the main problems. For some devices, for instance network
cards, we can ask NetworkManager (where available) via D-Bus whether
the device is currently in use. But we don't have that possibility
for all devices. Even more problematic is the other direction: how do we
figure out that we have to resume a device as soon as another
application wants to access it? I currently have no clue how to do this.
Thus, at the moment, the suspend and resume triggers are based on user
direction. We introduced a new so-called powersave scheme,
'AdvancedPowersave', to configure and handle all this experimental
stuff. Switching automatically to this scheme is possible, of course, but
not by default, and it is too early to do more magic like powering down
devices when they are idle and resuming them as soon as they are
needed. The new scheme will show up in our main client (kpowersave),
and most of the runtime power management options can be controlled from
there. Even without the automatic handling, we hope to reach some people,
or at least some power users, who want to give it a try despite the
warnings we will generate.
> Also, is there source available?
Of course there is. Tarballs [1] are available at SourceForge. You can
browse the code [2] directly from the svn web interface. The corresponding
source files are device_management.{cpp,h} and device.{cpp,h}.
>
> Thanks,
>
>
> Patrick
>
Regards,
Holger
[1] https://sourceforge.net/project/showfiles.php?group_id=124576
[2] http://forge.novell.com/modules/xfmod/svn/svnbrowse.php?repname=powersave
* Re: Runtime device power management in userspace
2005-12-24 0:40 ` Pavel Machek
@ 2005-12-26 20:43 ` Patrick Mochel
2005-12-26 22:33 ` Pavel Machek
2005-12-26 22:47 ` Alan Stern
0 siblings, 2 replies; 21+ messages in thread
From: Patrick Mochel @ 2005-12-26 20:43 UTC (permalink / raw)
To: Pavel Machek; +Cc: linux-pm
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN, Size: 2236 bytes --]
On Sat, 24 Dec 2005, Pavel Machek wrote:
> On Fri 23-12-05 07:12:35, Patrick Mochel wrote:
> > Please note that D3 is only relevant for PCI devices and for ACPI devices.
> > The fact that it's the same value for every device in the system is a
> > design flaw. Please be aware that the value to write to the device file
> > could change, and will be dependent on the type (bus) of device, and quite
> > possibly on the device itself. It may not even be '3' for all PCI devices
> > in the future, or may be a string rather than 1 character, or simply a '1'
> > into a different file.
>
> Is there enough locking in driver core to make this safe?
The issue I stated is actually orthogonal to locking the device; but the
answer to your question is: probably not. We should probably be taking the
per-device semaphore to protect against competing events that are trying
to add/remove a device or driver.
> Ouch and BTW *right* value is _2_, today... because state_store() uses
> value from user as a pm_message_t.event. We should at least supply
> some hint that it is runtime suspend in pm_message_t.flags --
> unfortunately we do not have pm_message_t.flags yet.
And I don't think that we want it. I think that we want a completely
different API for doing runtime PM. Using the same API will make the
drivers more complicated by having to have a repeated control sequence for
determining what state (and under what conditions) the device is entering.
Also, recall that the actual states the device and driver support could be
different than any other device or driver (even when the device or driver
are the same). What we need is a way for a driver to easily export the
states that it supports and a way to enter those exported states (e.g. an
array or linked list of state names and callbacks, like has been mentioned
a few times before).
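A rough userspace sketch of what such a per-driver table of state names
and callbacks could look like. All names here (struct pm_state,
struct pm_dev, pm_enter_state(), the demo_* entries) are hypothetical and
only illustrate the idea; this is not an existing kernel API:

```c
#include <stddef.h>
#include <string.h>

struct pm_dev;	/* hypothetical device handle */

/* One exported state: a name for userspace and a callback to enter it. */
struct pm_state {
	const char *name;
	int (*enter)(struct pm_dev *dev);
};

struct pm_dev {
	const struct pm_state *states;	/* table exported by the driver */
	size_t nr_states;
	const char *current;		/* name of the state last entered */
};

/* Look up a state by name and enter it; -1 if the driver lacks it. */
static int pm_enter_state(struct pm_dev *dev, const char *name)
{
	for (size_t i = 0; i < dev->nr_states; i++) {
		if (strcmp(dev->states[i].name, name) == 0) {
			int err = dev->states[i].enter(dev);

			if (!err)
				dev->current = dev->states[i].name;
			return err;
		}
	}
	return -1;
}

/* Example driver callbacks; a real driver would touch hardware here. */
static int demo_enter_on(struct pm_dev *dev) { (void)dev; return 0; }
static int demo_enter_d1(struct pm_dev *dev) { (void)dev; return 0; }

static const struct pm_state demo_states[] = {
	{ "on", demo_enter_on },
	{ "d1", demo_enter_d1 },
};
```

Because each driver supplies its own table, two drivers (or two instances
of the same driver) can export different sets of states without the core
having to know what any of the names mean.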
To that effect, the per-device power file and its semantics should go away
completely and be replaced with something that supports this new API.
> > Also, is there source available?
>
> Should be, we are trying to do *Open*SUSE these days ;-))).
Heh, but where? (And, actually, never mind, Holger pointed them out.. :-)
Thanks,
Patrick
* Re: Runtime device power management in userspace
2005-12-24 15:31 ` Holger Macht
@ 2005-12-26 20:58 ` Patrick Mochel
2005-12-27 20:04 ` Pavel Machek
0 siblings, 1 reply; 21+ messages in thread
From: Patrick Mochel @ 2005-12-26 20:58 UTC (permalink / raw)
To: Holger Macht; +Cc: linux-pm
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2578 bytes --]
On Sat, 24 Dec 2005, Holger Macht wrote:
> On Fri 23 Dec - 07:12:35, Patrick Mochel wrote:
> > How do you determine the idleness of a device? Or, is it based purely on
> > user direction?
>
> That's one of the main problems. For some devices, for instance network
> cards, we can query the NetworkManager (as far as available) via DBus if
> the device is currently in use or not. But we don't have that possiblity
> for all devices. The even more problematic point is the other way
> round. How to figure out if we have to resume a device as soon as another
> application wants to access it? I have currently no clue how to do this.
That is definitely a devilish detail (or rather, several of them). In
general, you want a device to resume as soon as it's needed by an
application. There's no way of knowing this from a 3rd party app, and
there is no way of changing every application to first power up a device.
So, it must happen in the kernel.
Where in the kernel is dependent on the type (class) of device. For some,
it will be on open(2), others will be connect(2), and others will be the
set of read(2)/write(2)/select(2)/poll(2). To this effect, I think that
the individual classes need to be changed to do e.g.
power_on(dev);
whenever a command that requires hardware access is issued. When that
operation finishes, perhaps it could do e.g.
power_off(dev);
to start an idle timer that results in the device entering a
pre-determined low-power state (optionally configurable via a per-device
sysfs file).
Note that power_on(dev) would simply cancel the timer if the device wasn't
actually in a low-power state.
Note also that this is beyond having a device enter a user-dictated state,
but will mesh nicely with it. A user-dictated state will cancel the timer
for a device and put it into the low-power state, then it will be woken
automatically once the device is used. (In essence, it would be like
specifying the idle state to enter with a timeout of 0, and have it
trigger immediately..)
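To make the intended behaviour concrete, here is a small userspace model
of the scheme. It is only a sketch: struct idle_dev, the tick-based clock,
idle_tick() and the users counter are assumptions added for illustration,
and power_off() takes an extra 'now' argument that the description above
does not have:

```c
#include <stdbool.h>

/* Hypothetical per-device bookkeeping; time is a plain tick counter. */
struct idle_dev {
	bool low_power;		/* device is in its low-power state */
	bool timer_armed;	/* idle timer is pending */
	unsigned long expires;	/* tick at which the timer fires */
	unsigned long timeout;	/* idle timeout in ticks (0 = immediately) */
	int users;		/* operations currently using the hardware */
};

/* Called before any operation that needs the hardware. */
static void power_on(struct idle_dev *dev)
{
	dev->users++;
	dev->timer_armed = false;	/* cancel a pending idle timer */
	dev->low_power = false;		/* wake the device if it was asleep */
}

/* Called when the operation finishes; arms the idle timer. */
static void power_off(struct idle_dev *dev, unsigned long now)
{
	if (dev->users > 0 && --dev->users == 0) {
		dev->timer_armed = true;
		dev->expires = now + dev->timeout;
	}
}

/* Timer tick: enter the low-power state once the timeout has elapsed. */
static void idle_tick(struct idle_dev *dev, unsigned long now)
{
	if (dev->timer_armed && now >= dev->expires) {
		dev->timer_armed = false;
		dev->low_power = true;
	}
}
```

With timeout set to 0 the device drops into the low-power state on the
first tick after power_off(), which matches the "timeout of 0" case for a
user-dictated state.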
Implementing all of this will take time and collaboration with the
different device classes, but I think that this is a direction which most
distros would like to head. Is that even remotely correct?
Any thoughts/ideas on that front?
> > Also, is there source available?
>
> Of course, there is. Tarballs [1] are available at sourceforge. You can
> directly browse the code [2] from the svn webinterface. The corresponding
> source files are device_management.{cpp,h} and device.{cpp,h}.
Cool, thanks. I'll have a look in the near future.
Thanks,
Patrick
* Re: Runtime device power management in userspace
2005-12-26 20:43 ` Patrick Mochel
@ 2005-12-26 22:33 ` Pavel Machek
2005-12-27 18:59 ` Patrick Mochel
2005-12-26 22:47 ` Alan Stern
1 sibling, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2005-12-26 22:33 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 2032 bytes --]
On Mon 26-12-05 12:43:43, Patrick Mochel wrote:
> > Is there enough locking in driver core to make this safe?
>
> The issue I stated is actually orthogonal to locking the device; but the
> answer to your question is: probably not. We should probably be taking the
> per-device semaphore to prevent against competing events that are trying
> to add/remove a device or driver.
Ok, thanks. That means that some races are still there, but they
probably don't matter for non-hotpluggable stuff like PCI, right?
> > Ouch and BTW *right* value is _2_, today... because state_store() uses
> > value from user as a pm_message_t.event. We should at least supply
> > some hint that it is runtime suspend in pm_message_t.flags --
> > unfortunately we do not have pm_message_t.flags yet.
>
> And I don't think that we want it. I think that we want a completely
> different API for doing runtime PM. Using the same API will make the
> drivers more complicated by having to have a repeated control sequence for
> determing what state (and under what conditions) the device is entering.
>
> Also, recall that the actual states the device and driver support could be
> different than any other device or driver (even when the device or driver
> are the same). What we need is a way for a driver to easily export the
> states that it supports and a way to enter those exported states (e.g. an
> array or linked list of state names and callbacks, like has been mentioned
> a few times before).
>
> To that effect, the per-device power file and its semantics should go away
> completely and replaced with something that supports this new API.
No one uses the .../power API just now (except Holger :-), so I think we
can still change it. Currently it is horribly broken -- taking 0/2 and
passing it directly to pm_message_t.event. I don't think we can solve
world hunger today... but what about at least changing the interface so
that it lists two states ("on" and "suspend") that are more or less
common to all drivers?
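A userspace sketch of how such a two-state string interface might be
parsed. enum pm_request and parse_pm_request() are hypothetical names, and
accepting one trailing newline is an assumption made so that a plain
"echo suspend" would work:

```c
#include <stddef.h>
#include <string.h>

enum pm_request { PM_REQ_INVALID = -1, PM_REQ_ON, PM_REQ_SUSPEND };

/*
 * Map a sysfs-style write buffer of n bytes to a request. One trailing
 * newline is tolerated because "echo suspend > file" appends one.
 */
static enum pm_request parse_pm_request(const char *buf, size_t n)
{
	if (n > 0 && buf[n - 1] == '\n')
		n--;
	if (n == 2 && memcmp(buf, "on", 2) == 0)
		return PM_REQ_ON;
	if (n == 7 && memcmp(buf, "suspend", 7) == 0)
		return PM_REQ_SUSPEND;
	return PM_REQ_INVALID;
}
```

Anything that is not exactly one of the two names, including the old
numeric values, is rejected rather than passed through to the driver.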
Pavel
--
Thanks, Sharp!
* Re: Runtime device power management in userspace
2005-12-26 20:43 ` Patrick Mochel
2005-12-26 22:33 ` Pavel Machek
@ 2005-12-26 22:47 ` Alan Stern
2005-12-27 17:29 ` Pavel Machek
1 sibling, 1 reply; 21+ messages in thread
From: Alan Stern @ 2005-12-26 22:47 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1908 bytes --]
On Mon, 26 Dec 2005, Patrick Mochel wrote:
> > Is there enough locking in driver core to make this safe?
>
> The issue I stated is actually orthogonal to locking the device; but the
> answer to your question is: probably not. We should probably be taking the
> per-device semaphore to prevent against competing events that are trying
> to add/remove a device or driver.
We are already taking that semaphore.
> > Ouch and BTW *right* value is _2_, today... because state_store() uses
> > value from user as a pm_message_t.event. We should at least supply
> > some hint that it is runtime suspend in pm_message_t.flags --
> > unfortunately we do not have pm_message_t.flags yet.
>
> And I don't think that we want it. I think that we want a completely
> different API for doing runtime PM. Using the same API will make the
> drivers more complicated by having to have a repeated control sequence for
> determing what state (and under what conditions) the device is entering.
>
> Also, recall that the actual states the device and driver support could be
> different than any other device or driver (even when the device or driver
> are the same). What we need is a way for a driver to easily export the
> states that it supports and a way to enter those exported states (e.g. an
> array or linked list of state names and callbacks, like has been mentioned
> a few times before).
>
> To that effect, the per-device power file and its semantics should go away
> completely and replaced with something that supports this new API.
Allow me to direct your attention to this posting:
http://lists.osdl.org/pipermail/linux-pm/2005-September/001421.html
and the follow-on messages. They implement exactly the type of API you're
talking about. When I have some time available I will rework the patches,
with improvements as suggested by the discussions on the mailing list, and
submit them.
Alan Stern
* Re: Runtime device power management in userspace
2005-12-26 22:47 ` Alan Stern
@ 2005-12-27 17:29 ` Pavel Machek
2005-12-27 17:36 ` Randy.Dunlap
0 siblings, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2005-12-27 17:29 UTC (permalink / raw)
To: Alan Stern; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 1568 bytes --]
On Mon 26-12-05 17:47:23, Alan Stern wrote:
> On Mon, 26 Dec 2005, Patrick Mochel wrote:
> > To that effect, the per-device power file and its semantics should go away
> > completely and replaced with something that supports this new API.
>
> Allow me to direct your attention to this posting:
>
> http://lists.osdl.org/pipermail/linux-pm/2005-September/001421.html
>
> and the follow-on messages. They implement exactly the type of API you're
> talking about. When I have some time available I will rework the patches,
> with improvements as suggested by the discussions on the mailing list, and
> submit them.
Ok, what about this as a first step? It unbreaks the interface between
kernel and user, and allows something like patches above to be taken
in future, without userland changes.
Pavel
diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
--- a/drivers/base/power/sysfs.c
+++ b/drivers/base/power/sysfs.c
@@ -33,15 +33,14 @@ static ssize_t state_show(struct device
static ssize_t state_store(struct device * dev, struct device_attribute *attr, const char * buf, size_t n)
{
pm_message_t state;
- char * rest;
- int error = 0;
+ int error = -EINVAL;
- state.event = simple_strtoul(buf, &rest, 10);
- if (*rest)
- return -EINVAL;
- if (state.event)
+ state.event = PM_EVENT_SUSPEND;
+ if ((n == 7) && !strncmp(buf, "suspend", min(n, 7)))
error = dpm_runtime_suspend(dev, state);
- else
+ if ((n == 2) && !strncmp(buf, "on", min(n, 2))) {
dpm_runtime_resume(dev);
+ error = 0;
+ }
return error ? error : n;
}
--
Thanks, Sharp!
* Re: Runtime device power management in userspace
2005-12-27 17:29 ` Pavel Machek
@ 2005-12-27 17:36 ` Randy.Dunlap
0 siblings, 0 replies; 21+ messages in thread
From: Randy.Dunlap @ 2005-12-27 17:36 UTC (permalink / raw)
To: Pavel Machek; +Cc: linux-pm
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1727 bytes --]
On Tue, 27 Dec 2005, Pavel Machek wrote:
> On Mon 26-12-05 17:47:23, Alan Stern wrote:
> > On Mon, 26 Dec 2005, Patrick Mochel wrote:
>
> > > To that effect, the per-device power file and its semantics should go away
> > > completely and replaced with something that supports this new API.
> >
> > Allow me to direct your attention to this posting:
> >
> > http://lists.osdl.org/pipermail/linux-pm/2005-September/001421.html
> >
> > and the follow-on messages. They implement exactly the type of API you're
> > talking about. When I have some time available I will rework the patches,
> > with improvements as suggested by the discussions on the mailing list, and
> > submit them.
>
> Ok, what about this as a first step? It unbreaks the interface between
> kernel and user, and allows something like patches above to be taken
> in future, without userland changes.
> Pavel
>
> diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
> --- a/drivers/base/power/sysfs.c
> +++ b/drivers/base/power/sysfs.c
> @@ -33,15 +33,14 @@ static ssize_t state_show(struct device
> static ssize_t state_store(struct device * dev, struct device_attribute *attr, const char * buf, size_t n)
> {
> pm_message_t state;
> - char * rest;
> - int error = 0;
> + int error = -EINVAL;
>
> - state.event = simple_strtoul(buf, &rest, 10);
> - if (*rest)
> - return -EINVAL;
> - if (state.event)
> + state.event = PM_EVENT_SUSPEND;
> + if ((n == 7) && !strncmp(buf, "suspend", min(n, 7)))
> error = dpm_runtime_suspend(dev, state);
> - else
> + if ((n == 2) && !strncmp(buf, "on", min(n, 2))) {
> dpm_runtime_resume(dev);
> + error = 0;
> + }
> return error ? error : n;
> }
You can drop the min(n, 2 or 7) since n == 2 or 7.
--
~Randy
* Re: Runtime device power management in userspace
2005-12-26 22:33 ` Pavel Machek
@ 2005-12-27 18:59 ` Patrick Mochel
2005-12-27 19:22 ` Pavel Machek
0 siblings, 1 reply; 21+ messages in thread
From: Patrick Mochel @ 2005-12-27 18:59 UTC (permalink / raw)
To: Pavel Machek; +Cc: linux-pm
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1901 bytes --]
On Mon, 26 Dec 2005, Pavel Machek wrote:
> On Mon 26-12-05 12:43:43, Patrick Mochel wrote:
> > > Is there enough locking in driver core to make this safe?
> >
> > The issue I stated is actually orthogonal to locking the device; but the
> > answer to your question is: probably not. We should probably be taking the
> > per-device semaphore to prevent against competing events that are trying
> > to add/remove a device or driver.
>
> Ok, thanks. That means that some races are still there, but they
> probably don't matter for non-hotpluggable stuff like PCI, right?
Well, the semaphore blocks against driver probing and removing, too.
> Noone uses .../power API just now (except Holger :-), so I think we
> can still change it. Currently it is horribly broken -- taking 0/2 and
> passing it directly to pm_message_t.event. I don't think we can solve
> worlds hunger today... but what about at least changing interface so
> that it lists two states ("on" and "suspend") that are more or less
> common to all drivers?
I don't think we want to do something from the core that is common to all
drivers in that area. The PM values and semantics are so different across
them that I don't think we could effectively abstract something for everything.
What we could do is remove the file from being added in the core, and have
the PCI core add it for all PCI devices that have PM capabilities and show
all the states that appear in the config space. From there, we could start
adding logic to filter out unsupported states and adding extra states that
the driver supports (that aren't in the config space, assuming that they
exist for some devices).
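As a sketch of the "show the states that appear in the config space" part:
the PCI Power Management Capabilities (PMC) register advertises optional
D1 and D2 support in bits 9 and 10. The helper below is a hypothetical
userspace illustration that assumes the caller has already read the 16-bit
PMC value; it is not the PCI core code, and the output format is made up:

```c
#include <stdint.h>
#include <stdio.h>

/* D1/D2 support bits in the PCI Power Management Capabilities register. */
#define PMC_D1_SUPPORT 0x0200
#define PMC_D2_SUPPORT 0x0400

/*
 * Format the D-states a function advertises, given its 16-bit PMC value.
 * D0 and D3 are always listed; D1 and D2 are optional.
 * Returns the number of characters written (excluding the terminator).
 */
static int format_pci_states(uint16_t pmc, char *buf, size_t len)
{
	return snprintf(buf, len, "D0%s%s D3",
			(pmc & PMC_D1_SUPPORT) ? " D1" : "",
			(pmc & PMC_D2_SUPPORT) ? " D2" : "");
}
```

Filtering out states the driver cannot actually resume from, or adding
driver-specific ones, would then be a second step layered on top of this
list.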
In short, you're right - we can't solve world hunger, but we should be
able to make some good progress on at least one continent. Once I'm back
to a development machine (and not a coffee shop wireless connection), I'll
try to cook up a patch..
Patrick
* Re: Runtime device power management in userspace
2005-12-27 18:59 ` Patrick Mochel
@ 2005-12-27 19:22 ` Pavel Machek
2005-12-27 19:29 ` Patrick Mochel
0 siblings, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2005-12-27 19:22 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm
On Út 27-12-05 10:59:44, Patrick Mochel wrote:
>
> On Mon, 26 Dec 2005, Pavel Machek wrote:
>
> > On Po 26-12-05 12:43:43, Patrick Mochel wrote:
> > > > Is there enough locking in driver core to make this safe?
> > >
> > > The issue I stated is actually orthogonal to locking the device; but the
> > > answer to your question is: probably not. We should probably be taking the
> > > per-device semaphore to prevent against competing events that are trying
> > > to add/remove a device or driver.
> >
> > Ok, thanks. That means that some races are still there, but they
> > probably don't matter for non-hotpluggable stuff like PCI, right?
>
> Well, the semaphore blocks against driver probing and removing, too.
Aha, so rmmod while banging .../power is not a good idea. Ok.
> > No one uses .../power API just now (except Holger :-), so I think we
> > can still change it. Currently it is horribly broken -- taking 0/2 and
> > passing it directly to pm_message_t.event. I don't think we can solve
> > worlds hunger today... but what about at least changing interface so
> > that it lists two states ("on" and "suspend") that are more or less
> > common to all drivers?
>
> I don't think we want to do something from the core that is common to all
> drivers in that area. The PM values and semantics are so different across
> that I don't think we could effectively abstract something for
> everything.
Well... at least "fully-on" and
"power-down-hardware-as-much-as-possible" are common to all the
drivers. And Ben on ppc does not freeze userspace while doing suspend,
so we are not really exposing anything new.
> What we could do is remove the file from being added in the core, and have
> the PCI core add it for all PCI devices that have PM capabilities and show
> all the states that appear in the config space. From there, we could
> start
I don't really think we want the complexity of putting a PCI device into
D0/D1/D2/D3hot/D3cold. All that userspace should care about is device
working/device suspended, and we could not test all 5 states, anyway.
Pavel
--
Thanks, Sharp!
* Re: Runtime device power management in userspace
2005-12-27 19:22 ` Pavel Machek
@ 2005-12-27 19:29 ` Patrick Mochel
2005-12-27 19:41 ` Pavel Machek
0 siblings, 1 reply; 21+ messages in thread
From: Patrick Mochel @ 2005-12-27 19:29 UTC (permalink / raw)
To: Pavel Machek; +Cc: linux-pm
On Tue, Dec 27, 2005 at 08:22:54PM +0100, Pavel Machek wrote:
> I don't really think we want complexity of putting PCI device into
> D0/D1/D2/D3hot/D3cold. All that userspace should care about is device
> working/device suspended, and we could not test all 5 states, anyway.
What do you mean?
The devices and drivers should support various states, and that's the
whole point of having multiple states - to make a choice based on the
power saving required vs. the latency requirements of bringing it back.
Granted, for most things, the latency to return from D3 (hot only, cold
is irrelevant during runtime) is not going to be noticeable, so that's
probably the only state most devices will ever enter.
But, in some cases, people are going to care about the intermediate
states, and we'll need to support them. It's simple enough to know
what states a PCI device supports, so I don't understand where the
complexity comes in?
Patrick
* Re: Runtime device power management in userspace
2005-12-27 19:29 ` Patrick Mochel
@ 2005-12-27 19:41 ` Pavel Machek
2005-12-27 20:40 ` Patrick Mochel
0 siblings, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2005-12-27 19:41 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm
On Út 27-12-05 11:29:56, Patrick Mochel wrote:
> On Tue, Dec 27, 2005 at 08:22:54PM +0100, Pavel Machek wrote:
>
> > I don't really think we want complexity of putting PCI device into
> > D0/D1/D2/D3hot/D3cold. All that userspace should care about is device
> > working/device suspended, and we could not test all 5 states, anyway.
>
> What do you mean?
>
> The devices and drivers should support various states, and that's the
> whole point of having multiple states - to make a choice based on the
> power saving required vs. the latency requirements of bringing it back.
>
> Granted, for most things, the latency to return from D3 (hot only, cold
> is irrelevant during runtime) is not going to be noticeable, so that's
> probably the only state most devices will ever enter.
Exactly. And for these "most devices", having to test/debug/support
D1/D2 is not worth the effort.
> But, in some cases, people are going to care about the intermediate
> states, and we'll need to support them. It's simple enough to know
> what states a PCI device supports, so I don't understand where the
> complexity comes in.. ?
Someone has to test all that... Unless we have an in-tree driver that
wants to use intermediate states, I think supporting them is a bad idea.
Pavel
--
Thanks, Sharp!
* Re: Runtime device power management in userspace
2005-12-26 20:58 ` Patrick Mochel
@ 2005-12-27 20:04 ` Pavel Machek
2005-12-27 20:54 ` Patrick Mochel
0 siblings, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2005-12-27 20:04 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm
On Po 26-12-05 12:58:02, Patrick Mochel wrote:
> On Sat, 24 Dec 2005, Holger Macht wrote:
> > On Fr 23. Dez - 07:12:35, Patrick Mochel wrote:
>
> > > How do you determine the idleness of a device? Or, is it based purely on
> > > user direction?
> >
> > That's one of the main problems. For some devices, for instance network
> > cards, we can query the NetworkManager (as far as available) via DBus if
> > the device is currently in use or not. But we don't have that possibility
> > for all devices. The even more problematic point is the other way
> > round. How to figure out if we have to resume a device as soon as another
> > application wants to access it? I have currently no clue how to do this.
>
> That is definitely a devilish detail (or rather, several of them). In
> general, you want a device to resume as soon as it's needed by an
> application. There's no way of knowing this from a 3rd party app, and
> there is no way of changing every application to first power up a device.
> So, it must happen in the kernel.
While we want to implement that someday, we should fix the easy stuff.
> Where in the kernel is dependent on the type (class) of device. For some,
> it will be on open(2), others will be connect(2), and others will be the
> set of read(2)/write(2)/select(2)/poll(2). To this effect, I think that
> the individual classes need to be changed to do e.g.
And for some, it is not possible to keep them functional and powered
down. Take ethernet interface: if you power it down, you'll not see
incoming packets.
> Implementing all of this will take time and collaboration with the
> different device classes, but I think that this is a direction which most
> distros would like to head. Is that even remotely correct?
Would like to head is correct... but that's year+ away. For now, we'd
like to get the simpler "power this piece of hardware down, return
errors to userspace", first.
Example usage is... you are traveling by train, so you know you are
not using ethernet and wifi... so you just force it down.
Pavel
--
Thanks, Sharp!
* Re: Runtime device power management in userspace
2005-12-27 19:41 ` Pavel Machek
@ 2005-12-27 20:40 ` Patrick Mochel
2005-12-27 21:06 ` Pavel Machek
0 siblings, 1 reply; 21+ messages in thread
From: Patrick Mochel @ 2005-12-27 20:40 UTC (permalink / raw)
To: Pavel Machek; +Cc: linux-pm
On Tue, 27 Dec 2005, Pavel Machek wrote:
> On Út 27-12-05 11:29:56, Patrick Mochel wrote:
> > But, in some cases, people are going to care about the intermediate
> > states, and we'll need to support them. It's simple enough to know
> > what states a PCI device supports, so I don't understand where the
> > complexity comes in.. ?
>
> Someone has to test all that... Unless we have in-tree driver that
> wants use intermediate states, I think supporting them is bad idea.
I don't understand what your issue is; perhaps you could explain it a bit
better. There is no additional work on the part of the core, nor any
additional complication in supporting 4 possible power states vs
supporting 2 vs supporting any arbitrary number that the driver and device
implement.
I did not suggest that we make the drivers support intermediate states; I
only stated that there are some drivers and devices that can and will
implement them, and that there are cases in which it will make sense to
enter them from userspace.
If the sysfs files are simply checking string values and making a
driver-based call based on the validity of the string, then it doesn't
matter a) what the strings are or b) how many there are.
I admit I made a mistake in suggesting the interface - we wouldn't want to
simply export all the PM states that are present in the device's config
space. We would really only want to export the states that the driver says
it supports. But, that could be any number of states, above and beyond (or
a subset of) D1/D2/D3/etc (or whatever the bus spec says are standard
states).
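To make the "check the string, then make a driver-based call" idea concrete,
here is a minimal sketch in plain C rather than kernel code. The names
struct pm_state_name, pm_state_lookup() and example_states are hypothetical;
the only assumption is that each driver hands over a table of the state names
it supports, and the store routine needs no knowledge of what the names are
or how many exist.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-driver table entry: a state name and an opaque value. */
struct pm_state_name {
    const char *name;
    int value;
};

/*
 * Look up the buffer written by userspace in the driver's table.
 * One trailing newline (as produced by echo) is ignored.
 * Returns the matching value, or -1 if the driver does not list the state.
 */
static int pm_state_lookup(const struct pm_state_name *tbl, size_t count,
                           const char *buf, size_t n)
{
    size_t i;

    if (n > 0 && buf[n - 1] == '\n')
        n--;
    for (i = 0; i < count; i++) {
        if (strlen(tbl[i].name) == n && memcmp(tbl[i].name, buf, n) == 0)
            return tbl[i].value;
    }
    return -1;
}

/* Example table for an imaginary driver that supports three states. */
static const struct pm_state_name example_states[] = {
    { "on", 0 },
    { "standby", 1 },
    { "suspend", 3 },
};
```

A driver with only two states would pass a two-entry table; the lookup code
stays the same.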
As far as testing, how does that factor into the picture? Do you feel that
we currently (or will) lack coverage testing? If so, I agree. However, I
do not think that should prevent us from thinking clearly about what the
best possible solution is.
If there is a need for more and/or better testing then we can collectively
gather the resources to accomplish it.
Thanks,
Patrick
* Re: Runtime device power management in userspace
2005-12-27 20:04 ` Pavel Machek
@ 2005-12-27 20:54 ` Patrick Mochel
0 siblings, 0 replies; 21+ messages in thread
From: Patrick Mochel @ 2005-12-27 20:54 UTC (permalink / raw)
To: Pavel Machek; +Cc: linux-pm
On Tue, 27 Dec 2005, Pavel Machek wrote:
> On Po 26-12-05 12:58:02, Patrick Mochel wrote:
> > On Sat, 24 Dec 2005, Holger Macht wrote:
> > > On Fr 23. Dez - 07:12:35, Patrick Mochel wrote:
> >
> > > > How do you determine the idleness of a device? Or, is it based purely on
> > > > user direction?
> > >
> > > That's one of the main problems. For some devices, for instance network
> > > cards, we can query the NetworkManager (as far as available) via DBus if
> > > the device is currently in use or not. But we don't have that possibility
> > > for all devices. The even more problematic point is the other way
> > > round. How to figure out if we have to resume a device as soon as another
> > > application wants to access it? I have currently no clue how to do this.
> >
> > That is definitely a devilish detail (or rather, several of them). In
> > general, you want a device to resume as soon as it's needed by an
> > application. There's no way of knowing this from a 3rd party app, and
> > there is no way of changing every application to first power up a device.
> > So, it must happen in the kernel.
>
> While we want to implement that someday, we should fix the easy stuff.
Please see below.
> > Where in the kernel is dependent on the type (class) of device. For some,
> > it will be on open(2), others will be connect(2), and others will be the
> > set of read(2)/write(2)/select(2)/poll(2). To this effect, I think that
> > the individual classes need to be changed to do e.g.
>
> And for some, it is not possible to keep them functional and powered
> down. Take ethernet interface: if you power it down, you'll not see
> incoming packets.
That's a valid point, but that doesn't invalidate the concept for other
types of devices.
> > Implementing all of this will take time and collaboration with the
> > different device classes, but I think that this is a direction which most
> > distros would like to head. Is that even remotely correct?
>
> Would like to head is correct... but that's year+ away. For now, we'd
> like to get the simpler "power this piece of hardware down, return
> errors to userspace", first.
I'm not talking about now or talking about the easy stuff, I'm talking
about the future and about the hard stuff. It's been evident that we need
a better and more structured solution to the entire scheme for quite some
time.
If it takes a year+, then it takes a year+. But, the important thing is
that we have the discussion. And, while you've probably been involved in
internal discussions about powersave with its developers, it's the first
time that I've interacted with them publicly, and I'm very interested in
what they have to say. :-)
> Example usage is... you are traveling by train, so you know you are
> not using ethernet and wifi... so you just force it down.
That's on ifdown(1), and that's probably the right place to implement the
network-based PM. What system calls does that translate to? Are they only
netdev-based ioctls?
Thanks,
Patrick
* Re: Runtime device power management in userspace
2005-12-27 20:40 ` Patrick Mochel
@ 2005-12-27 21:06 ` Pavel Machek
0 siblings, 0 replies; 21+ messages in thread
From: Pavel Machek @ 2005-12-27 21:06 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm
On Út 27-12-05 12:40:10, Patrick Mochel wrote:
> On Tue, 27 Dec 2005, Pavel Machek wrote:
> > On Út 27-12-05 11:29:56, Patrick Mochel wrote:
> > > But, in some cases, people are going to care about the intermediate
> > > states, and we'll need to support them. It's simple enough to know
> > > what states a PCI device supports, so I don't understand where the
> > > complexity comes in.. ?
> >
> > Someone has to test all that... Unless we have in-tree driver that
> > wants use intermediate states, I think supporting them is bad idea.
>
> I don't understand what your issue is; perhaps you could explain it a bit
> better. There is no additional work on the part of the core, nor any
> additional complication in supporting 4 possible power states vs
> supporting 2 vs supporting any arbitrary number that the driver and device
> implement.
Well, I can do two states today, with the patch below :-). Actually it
should work today as-is: writing 0/2 does mostly the right thing now.
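For reference, "writing 0/2" from userspace amounts to writing a short
string to the device's power state file. A minimal sketch, assuming the
caller already knows the path of that file; write_power_state() is a made-up
helper name, not part of any existing tool.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a state string such as "0" or "2" to the
 * given sysfs attribute path. Returns 0 on success, -1 on failure.
 */
static int write_power_state(const char *path, const char *state)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = (fputs(state, f) >= 0);
    /*
     * stdio buffers the data, so an error returned by the kernel's store
     * routine may only show up when the stream is flushed by fclose().
     */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Checking the result matters here, because a store routine that rejects the
value is the only feedback userspace gets.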
> I did not suggest that we make the drivers support intermediate states; I
> only stated that there are some drivers and devices that can and will
> implement them, and that there are cases in which it will make sense to
> enter them from userspace.
Ok.
> I admit I made a mistake in suggesting the interface - we wouldn't want to
> simply export all the PM states that are present in the device's config
> space. We would really only want to export the states that the driver says
> it supports. But, that could be any number of states, above and
> beyond (or
Ok.
Pavel
diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
--- a/drivers/base/power/sysfs.c
+++ b/drivers/base/power/sysfs.c
@@ -33,15 +33,12 @@ static ssize_t state_show(struct device
static ssize_t state_store(struct device * dev, struct device_attribute *attr, const char * buf, size_t n)
{
pm_message_t state;
- char * rest;
- int error = 0;
+ int error = -EINVAL;
- state.event = simple_strtoul(buf, &rest, 10);
- if (*rest)
- return -EINVAL;
- if (state.event)
+ state.event = PM_EVENT_SUSPEND;
+ if ((n == 2) && !strncmp(buf, "on", min(n, 2)))
error = dpm_runtime_suspend(dev, state);
- else
+ if ((n == 7) && !strncmp(buf, "suspend", min(n, 7)))
dpm_runtime_resume(dev);
return error ? error : n;
}
--
Thanks, Sharp!
Thread overview: 21+ messages
2005-12-23 14:30 Runtime device power management in userspace Holger Macht
2005-12-23 15:12 ` Patrick Mochel
2005-12-24 0:40 ` Pavel Machek
2005-12-26 20:43 ` Patrick Mochel
2005-12-26 22:33 ` Pavel Machek
2005-12-27 18:59 ` Patrick Mochel
2005-12-27 19:22 ` Pavel Machek
2005-12-27 19:29 ` Patrick Mochel
2005-12-27 19:41 ` Pavel Machek
2005-12-27 20:40 ` Patrick Mochel
2005-12-27 21:06 ` Pavel Machek
2005-12-26 22:47 ` Alan Stern
2005-12-27 17:29 ` Pavel Machek
2005-12-27 17:36 ` Randy.Dunlap
2005-12-24 15:31 ` Holger Macht
2005-12-26 20:58 ` Patrick Mochel
2005-12-27 20:04 ` Pavel Machek
2005-12-27 20:54 ` Patrick Mochel
2005-12-23 15:17 ` Alan Stern
2005-12-24 0:41 ` Pavel Machek
2005-12-24 0:43 ` Pavel Machek