* RE: RE: on-ness
@ 2006-04-18 18:39 Brown, Len
2006-04-20 13:25 ` Pavel Machek
0 siblings, 1 reply; 23+ messages in thread
From: Brown, Len @ 2006-04-18 18:39 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 2695 bytes --]
(restored linux-pm to cc:)
yes, the concept was to explore a language generic enough
that it could describe either CPUs or any other devices.
no, this wasn't expected to be complete, just a start.
yes, wakeup capability is one thing we discussed.
note that we may need to differentiate between enabling
the device to wakeup the system, and waking up just
the device itself.
no, if the numbers happen to correlate to ACPI states,
that is purely lucky for the ACPI maintainer:-)
But honestly, we should "leverage" all we can from ACPI.
My role, of course, is to make sure that the generic
description can accommodate everything ACPI can do.
yes, idle refers to either Cx or Dx states -- it depends
on the device. 0 means operating, non zero means non-operating.
Re: strings.
we briefly debated strings vs numbers.
I'm not confident that there are enough unique strings
available without falling into state0, state1, state2, state3 --
so numbers seemed simpler.
The only issue I see with numbers is that they imply order,
and it may be that some operating points might not have
such a strict order.
apology in advance for the top-post, Intel IT re-installed my
laptop and a lot of things are not "quite right"...
-Len
-----Original Message-----
From: David Brownell [mailto:david-b@pacbell.net]
Sent: Tuesday, April 18, 2006 2:15 PM
To: Brown, Len
Cc: Richard A. Griffiths
Subject: Re: [linux-pm] RE: on-ness
On Monday 17 April 2006 2:43 pm, Brown, Len wrote:
> > Thinking about the discussion of the ON field. How about Limiter? Then
> > maps to no limit (max power, max freq, whatever) and any
> > other number is
> > some limit of performance/power, similar to what was decided for Idle.
>
> my scribbles on generic sysfs device directory file names say:
That is, you're talking about parameters related to individual devices?
Or CPUs?
Let's surface more parameters first, before talking about managers!
One more is the wakeup capability; devices may easily have two parameter
sets, one spending a bit of power to deliver wakeup capability.
Another is which <linux/clk.h> clocks are associated with the device.
> state:
> on - running and available
> off - requires a full device initialization to be usable
Also e.g. "bus suspend" causing maybe 2 mA current draw vs normal
usb root port power budgets of 100+ mA ... two modes like that, one
supporting remote wakeup and one not.
> idle: # = "how idle"
> 0 - active, not idle at all eg C0, D0
> 1 - idle. eg C1, D1
> ...
> n - most power saving, highest latency idle state, eg. Cn, Dn
These are ACPI CPU states Cx? Or device states Dx? Such values
should be a string, which need not match acpi conventions...
- Dave
* Re: RE: on-ness
2006-04-18 18:39 RE: on-ness Brown, Len
@ 2006-04-20 13:25 ` Pavel Machek
2006-04-21 15:27 ` David Brownell
0 siblings, 1 reply; 23+ messages in thread
From: Pavel Machek @ 2006-04-20 13:25 UTC (permalink / raw)
To: Brown, Len; +Cc: David Brownell, linux-pm
[-- Attachment #1: Type: text/plain, Size: 1699 bytes --]
Hi!
> (restored linux-pm to cc:)
> no, if the numbers happen to correlate to ACPI states,
> that is purely lucky for the ACPI maintainer:-)
> But honestly, we should "leverage" all we can from ACPI.
> My role, of course, is to make sure that the generic
> description can accommodate everything ACPI can do.
Please, put a translation layer there from the beginning. Otherwise
people will assume they *are* ACPI states, and great confusion will
begin.
See suspend functions, where half people assumed they are acpi Dx
states, half people thought they are pci Dx states, and half the
people assumed they are system Sx states.
It took quite long to sort out.
> yes, idle refers to either Cx or Dx states -- it depends
> on the device. 0 means operating, non zero means non-operating.
>
> Re: strings.
> we briefly debated strings vs numbers.
> I'm not confident that there are enough unique strings
> available without falling into state0, state1, state2, state3 --
> so numbers seemed simpler.
> The only issue I see with numbers is that they imply order,
> and it may be that some operating points might not have
> such a strict order.
Second issue is that you might want to "insert" between states.
Let's look at a disk. Old disks had:
spinning
spindown
states. You'd name them 0 and 1, eventually apps will learn to use
that. But (some) newer disks have
spinning
slowspin
spindown
If an app does echo 1 > state, it will get slowspin instead of the
spindown it wanted.
Yes, inventing good names may be tricky in some cases, but in other
cases names are very natural (arm has sleep, deep sleep and big sleep,
iirc).... And we can always fall back to state0..state5.
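
To make the renumbering hazard concrete, here is a minimal sketch in Python.
The two state tables mirror the disk example above; the helper names
(select_by_number, select_by_name) are made up for illustration and are not
any existing kernel or sysfs interface.

```python
# Hypothetical state tables, mirroring the disk example above.
OLD_DISK_STATES = ["spinning", "spindown"]
NEW_DISK_STATES = ["spinning", "slowspin", "spindown"]


def select_by_number(states, index):
    """Resolve a numeric request such as `echo 1 > state`."""
    return states[index]


def select_by_name(states, name):
    """Resolve a named request; unknown names fail loudly instead of drifting."""
    if name not in states:
        raise ValueError(f"unsupported state: {name!r}")
    return name


# An app written against the old disk asks for state 1, meaning "spindown".
old_result = select_by_number(OLD_DISK_STATES, 1)
# On the newer disk the same number silently lands on the inserted state.
new_result = select_by_number(NEW_DISK_STATES, 1)

# The same app, asking by name, gets the same state on the newer disk.
named_result = select_by_name(NEW_DISK_STATES, "spindown")
```

With names, a state the hardware lacks produces an error rather than a
different state, which is arguably the safer failure mode.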
Pavel
--
Thanks, Sharp!
* Re: RE: on-ness
2006-04-20 13:25 ` Pavel Machek
@ 2006-04-21 15:27 ` David Brownell
2006-04-21 15:40 ` Dominik Brodowski
2006-04-21 17:15 ` David Brownell
0 siblings, 2 replies; 23+ messages in thread
From: David Brownell @ 2006-04-21 15:27 UTC (permalink / raw)
To: Pavel Machek; +Cc: linux-pm
[-- Attachment #1: Type: text/plain, Size: 4175 bytes --]
On Thursday 20 April 2006 6:25 am, Pavel Machek wrote:
> > no, if the numbers happen to correlate to ACPI states,
> > that is purely lucky for the ACPI maintainer:-)
> > But honestly, we should "leverage" all we can from ACPI.
> > My role, of course, is to make sure that the generic
> > description can accommodate everything ACPI can do.
>
> Please, put a translation layer there from the beginning. Otherwise
> people will assume they *are* ACPI states, and great confusion will
> begin.
Better yet, don't use numbers, since that's the root cause of the
problem. Typed enums are OK. (But of course, ACPI-specific enums
should appear *only* inside the ACPI code...)
> See suspend functions, where half people assumed they are acpi Dx
> states, half people thought they are pci Dx states, and half the
> people assumed they are system Sx states.
When I did the research on this a while back, it was self-evident
that the original 2.4 suspend calls expected system Sx states.
Otherwise the drivers which looked at the parameter and used it
when selecting a target device state would have made no sense.
(Likewise, all those drivers had to **remove functionality** as
part of the pm_message_t search'n'destroy mission...)
There was mild confusion about them being PCI_Dx states, mostly
because the simplest mostly-works mapping of ACPI_Sx states to
PCI_Dx states was the identity mapping. And especially for PM,
many driver writers are lazy.
I don't ever recall seeing drivers assume the parameter was an
ACPI_Dx state. That would have been deeply wrong, since most
hardware will never run ACPI. What I did see was that the first
incarnation of the 2.6 power management framework changed the
state numbers in a way that broke most of the drivers relying on
that identity mapping ... and I also saw disagreement between that
framework and the swsusp code about what numbers to pass.
All that's resolved now, but with a net loss of functionality.
And yes, the root cause was the initial use of (untyped) numbers,
where a translation layer would have reduced trouble. But not
using numbers could have avoided the problems entirely!
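
A rough sketch of the "typed enums plus translation layer" idea, in Python
for brevity rather than kernel C. Everything here is an assumption made for
illustration: the class names, the subset of states, and especially the
mapping table, which stands in for whatever policy a particular driver
would really want.

```python
from enum import Enum


class SystemState(Enum):
    """System-wide sleep states (an illustrative subset)."""
    S0 = "S0"
    S3 = "S3"
    S4 = "S4"


class PciState(Enum):
    """PCI device power states (an illustrative subset)."""
    D0 = "D0"
    D3HOT = "D3hot"


# Explicit translation layer: an assumed policy for one hypothetical driver.
SYSTEM_TO_PCI = {
    SystemState.S0: PciState.D0,
    SystemState.S3: PciState.D3HOT,
    SystemState.S4: PciState.D3HOT,
}


def suspend_target(state):
    """Pick the device state for a system transition; reject anything else."""
    if not isinstance(state, SystemState):
        raise TypeError(f"expected SystemState, got {type(state).__name__}")
    return SYSTEM_TO_PCI[state]
```

Passing a bare 3, or a PciState, is rejected instead of being silently
reinterpreted, so the "which kind of state is this number?" question never
comes up.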
> > Re: strings.
> > we briefly debated strings vs numbers.
> > I'm not confident that there are enough unique strings
> > available without falling into state0, state1, state2, state3 --
> > so numbers seemed simpler.
This is called a "lack of imagination". ;)
Most SOC specs I look at don't use numbers to describe either
device, CPU, or system states ... many don't even distinguish
CPU sleep/idle states from system states. They use a variety
of names to describe various aspects of the hardware states, and
often the issues are more like "which clocks are available".
(Or, for system states, "clock domains" ... e.g. "everything
derived from oscillator X" or tied to a given PLL.)
> > The only issue I see with numbers is that they imply order,
> > and it may be that some operating points might not have
> > such a strict order.
In my observation, "strict order" would be the exception not
the rule. There are often three or four orthogonal factors,
and they don't naturally fit any two-dimensional linear order.
> Second issue is that you might want to "insert" between states.
> ...
Of course, "between" implies some strict/linear order...
> Yes, inventing good names may be tricky in some cases, but in other
> cases names are very natural (arm has sleep, deep sleep and big sleep,
> iirc).... And we can always fall back to state0..state5.
Well, OMAP is one implementation that uses ARM, and it certainly has
"deep sleep" and "big sleep". But other ARM based SOCs provide very
different power abstractions (consider "slow clock mode", "idle",
"frozen", "standby", "stop", "sleep") and may use the same names to
indicate different things. System state names are system specific.
If ACPI wants to use names like "ACPI_S0".."ACPI_S5", that's fine;
but Linux should not inflict such an approach on systems that don't
use ACPI. Developers might find it handy to contrast one SOC's
"deep sleep" to "ACPI_S3" (or to "deep sleep" on another SOC), but
it won't be an exact match; square peg, round hole.
- Dave
* Re: RE: on-ness
2006-04-21 15:27 ` David Brownell
@ 2006-04-21 15:40 ` Dominik Brodowski
2006-04-21 17:03 ` David Brownell
2006-04-21 17:15 ` David Brownell
1 sibling, 1 reply; 23+ messages in thread
From: Dominik Brodowski @ 2006-04-21 15:40 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 2231 bytes --]
On Fri, Apr 21, 2006 at 08:27:32AM -0700, David Brownell wrote:
> > > The only issue I see with numbers is that they imply order,
> > > and it may be that some operating points might not have
> > > such a strict order.
>
> In my observation, "strict order" would be the exception not
> the rule. There are often three or four orthogonal factors,
> and they don't naturally fit any two-dimensional linear order.
We need to distinguish two aspects here -- the "whole system states", which
in fact create a multi-dimensional problem, and one specific attribute of
one specific device. The performance level of one specific networking device,
for example. Or its sleep state. Or, if you can describe sub-aspects of a
networking device, the performance level or the sleep state of that
sub-device. So each strict-order parameter has its own file (that's why
the RFC mentioned three files for CPUs in the ACPI-model, for performance,
idle and throttling; different CPUs in different, especially embedded
surroundings may require additional files).
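
As a sketch of what "one file per strict-order parameter" could look like
from the outside (Python, purely illustrative): the three attribute names
come from the paragraph above, while the value ranges and the
write_attribute helper are invented for the example.

```python
# Hypothetical attribute files for one CPU, each with its own ordered range.
CPU_ATTRIBUTES = {
    "performance": range(0, 4),
    "idle": range(0, 3),
    "throttling": range(0, 8),
}


def write_attribute(current, name, value):
    """Return a new settings dict with one attribute changed, others untouched."""
    if name not in CPU_ATTRIBUTES:
        raise KeyError(f"no such attribute file: {name}")
    if value not in CPU_ATTRIBUTES[name]:
        raise ValueError(f"{name} accepts {CPU_ATTRIBUTES[name]}, got {value}")
    updated = dict(current)
    updated[name] = value
    return updated


settings = {"performance": 0, "idle": 0, "throttling": 0}
settings = write_attribute(settings, "idle", 2)
```

Each attribute is ordered on its own axis, so a number only ever has to be
compared with other values of the same file.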
> > Yes, inventing good names may be tricky in some cases, but in other
> > cases names are very natural (arm has sleep, deep sleep and big sleep,
> > iirc).... And we can always fall back to state0..state5.
>
> Well, OMAP is one implementation that uses ARM, and it certainly has
> "deep sleep" and "big sleep". But other ARM based SOCs provide very
> different power abstractions (consider "slow clock mode", "idle",
> "frozen", "standby", "stop", "sleep") and may use the same names to
> indicate different things. System state names are system specific.
Well, the big problem with names and anything "system specific" is that it
makes _abstractions_ harder. It makes userspace's life harder, as it needs
to know what "idle" means on a specific system, instead.
> If ACPI wants to use names like "ACPI_S0".."ACPI_S5", that's fine;
> but Linux should not inflict such an approach on systems that don't
> use ACPI. Developers might find it handy to contrast one SOC's
> "deep sleep" to "ACPI_S3" (or to "deep sleep" on another SOC), but
> it won't be an exact match; square peg, round hole.
Here you're talking about "system states" instead of "device states" again.
Dominik
* Re: RE: on-ness
2006-04-21 15:40 ` Dominik Brodowski
@ 2006-04-21 17:03 ` David Brownell
2006-04-21 17:12 ` Dominik Brodowski
0 siblings, 1 reply; 23+ messages in thread
From: David Brownell @ 2006-04-21 17:03 UTC (permalink / raw)
To: Dominik Brodowski; +Cc: linux-pm, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 4999 bytes --]
On Friday 21 April 2006 8:40 am, Dominik Brodowski wrote:
> On Fri, Apr 21, 2006 at 08:27:32AM -0700, David Brownell wrote:
> > > > The only issue I see with numbers is that they imply order,
> > > > and it may be that some operating points might not have
> > > > such a strict order.
> >
> > In my observation, "strict order" would be the exception not
> > the rule. There are often three or four orthogonal factors,
> > and they don't naturally fit any two-dimensional linear order.
>
> We need to distinguish two aspects here -- the "whole system states", which
> in fact create a multi-dimensional problem, and one specific attribute of
> one specific device.
Individual device operating points are multi-dimensional too. Controllers
are just mini-systems after all, and some of the device attributes will
be constrained by system attributes ...
> The performance level of one specific networking device,
> for example. Or its sleep state.
... e.g. if it's suspended during system-wide "slow clock mode" it might
be pretty well unusable, except maybe for PHY based wakeup events; but
if it's suspended during a more functional system state, enough clocks
may be available for wake-on-LAN to behave.
And then there can be PM-aware drivers that keep idle devices in low
power states whenever that's possible (activating on TX or from
wake-on-LAN RX) ... so there might be no hardware-level differences
between states before and after suspend(), other than suspend()
always shutting down the network stack state (TX queues etc) too.
> Or, if you can describe sub-aspects of a
> networking device, the performance level or the sleep state of that
> sub-device. So each strict-order parameter has its own file (that's why
> the RFC mentioned three files for CPUs in the ACPI-model, for performance,
> idle and throttling; different CPUs in different, especially embedded
> surroundings may require additional files).
I seem to have missed some RFC... the note on "on-ness" I responded to
had some musings, but no evident proposal. (It wasn't even clear
what it was describing...) The embedded CPUs I've worked with wouldn't
have much need for "idle" and "throttling" controls, either.
> > > Yes, inventing good names may be tricky in some cases, but in other
> > > cases names are very natural (arm has sleep, deep sleep and big sleep,
> > > iirc).... And we can always fall back to state0..state5.
> >
> > Well, OMAP is one implementation that uses ARM, and it certainly has
> > "deep sleep" and "big sleep". But other ARM based SOCs provide very
> > different power abstractions (consider "slow clock mode", "idle",
> > "frozen", "standby", "stop", "sleep") and may use the same names to
> > indicate different things. System state names are system specific.
>
> Well, the big problem with names and anything "system specific" is that it
> makes _abstractions_ harder. It makes userspace's life harder, as it needs
> to know what "idle" means on a specific system, instead.
If by "userspace" we can mean just "what writes the /sys/power/state file",
it's straightforward for a given system to provide mappings between some
common tokens ("standby", "mem", etc) to a system-specific meaning.
Of course, those tokens don't necessarily expose all the meanings that
are wanted for managing power on that system. But they're probably a
reasonable start. The code behind a system's pm_ops will package a lot
of decisions about the operating points associated with "standby" etc,
and it's a bit of work to make sure any given operating point is both
sensible and functional.
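
A small sketch of that token mapping, in Python. The platform names and the
SOC-side state names are hypothetical, and the table only stands in for the
decisions that real pm_ops code would make; it is not how any particular
platform is implemented.

```python
# Hypothetical per-platform tables: common token -> system-specific state.
PLATFORM_STATES = {
    "acpi_pc": {"standby": "ACPI_S1", "mem": "ACPI_S3"},
    "example_soc": {"standby": "idle", "mem": "deep sleep"},
}


def resolve_token(platform, token):
    """Translate a common token into the platform's own state name."""
    table = PLATFORM_STATES[platform]
    if token not in table:
        raise ValueError(f"{platform} does not support {token!r}")
    return table[token]
```

Whatever writes the common token never needs to know the platform's own
vocabulary; only the table does.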
> > If ACPI wants to use names like "ACPI_S0".."ACPI_S5", that's fine;
> > but Linux should not inflict such an approach on systems that don't
> > use ACPI. Developers might find it handy to contrast one SOC's
> > "deep sleep" to "ACPI_S3" (or to "deep sleep" on another SOC), but
> > it won't be an exact match; square peg, round hole.
>
> Here you're talking about "system states" instead of "device states" again.
The same points hold for "ACPI_D0"..."ACPI_D3" states. If a device may need
up to four clocks for its fully operational state, but doesn't need all of
them for more typical reduced-function (and reduced-power) states, there's
not necessarily going to be a natural linear/"strict" sequencing of those
states. "More power for task one and less power to task two" might be just
as much power as "more-for-two/less-for-one" ... and from the userspace
perspective, they could act just like "full power for both" for drivers
that automatically handle the transitions.
And those might all map to ACPI_D0 states ... there's still the same peg/hole
problem, coming from thinking that details of the ACPI model should apply.
(That is, the ACPI model makes general sense when talking about different
device and system states but not when trying to define details -- which
should be system-specific! -- about what those states should be/model, and
how they should behave.)
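
To illustrate why such states resist a strict ordering, here is a minimal
Python sketch. The two-axis operating point and the power numbers are made
up; the only point is that two states can each be "lower" on one axis while
costing the same in total.

```python
from typing import NamedTuple


class OperatingPoint(NamedTuple):
    """Power budget per task, in made-up units."""
    task_one: int
    task_two: int


def uses_no_more_than(a, b):
    """True if `a` needs no more power than `b` on every axis."""
    return all(x <= y for x, y in zip(a, b))


def comparable(a, b):
    """True if one point dominates the other, i.e. they can be ranked."""
    return uses_no_more_than(a, b) or uses_no_more_than(b, a)


FULL = OperatingPoint(task_one=2, task_two=2)
MOSTLY_ONE = OperatingPoint(task_one=2, task_two=1)
MOSTLY_TWO = OperatingPoint(task_one=1, task_two=2)
```

Here FULL can be ranked against either reduced state, but MOSTLY_ONE and
MOSTLY_TWO cannot be ranked against each other, so numbering them 1 and 2
would assert an order that the hardware does not have.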
- Dave
* Re: RE: on-ness
2006-04-21 17:03 ` David Brownell
@ 2006-04-21 17:12 ` Dominik Brodowski
2006-04-21 18:30 ` David Brownell
0 siblings, 1 reply; 23+ messages in thread
From: Dominik Brodowski @ 2006-04-21 17:12 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 3769 bytes --]
On Fri, Apr 21, 2006 at 10:03:40AM -0700, David Brownell wrote:
> > We need to distinguish two aspects here -- the "whole system states", which
> > in fact create a multi-dimensional problem, and one specific attribute of
> > one specific device.
>
> Individual device operating points are multi-dimensional too. Controllers
> are just mini-systems after all, and some of the device attributes will
> be constrained by system attributes ...
Exactly, that is what I was referring to as well :)
> > The performance level of one specific networking device,
> > for example. Or its sleep state.
>
> ... e.g. if it's suspended during system-wide "slow clock mode" it might
> be pretty well unusable, except maybe for PHY based wakeup events; but
> if it's suspended during a more functional system state, enough clocks
> may be available for wake-on-LAN to behave.
Sure. But again you're mixing system state and device state. Of course the
former may constrain the latter, and vice versa.
> > Or, if you can describe sub-aspects of a
> > networking device, the performance level or the sleep state of that
> > sub-device. So each strict-order parameter has its own file (that's why
> > the RFC mentioned three files for CPUs in the ACPI-model, for performance,
> > idle and throttling; different CPUs in different, especially embedded
> > surroundings may require additional files).
>
> I seem to have missed some RFC... the note on "on-ness" I responded to
> had some musings, but no evident proposal.
Mis-naming on my part - I was just referring to the "on-ness" proposal too.
> The embedded CPUs I've worked with wouldn't
> have much need for "idle" and "throttling" controls, either.
Other embedded CPUs do, though... and we're trying to abstract things here,
right?
> > Well, the big problem with names and anything "system specific" is that it
> > makes _abstractions_ harder. It makes userspace's life harder, as it needs
> > to know what "idle" means on a specific system, instead.
>
> If by "userspace" we can mean just "what writes the /sys/power/state file",
> it's straightforward for a given system to provide mappings between some
> common tokens ("standby", "mem", etc) to a system-specific meaning.
Uh. Not /sys/power/state. But /sys/devices/...../power/{[a],[b],[c]} where
[a], [b] and [c] need sensible names.
> Of course, those tokens don't necessarily expose all the meanings that
> are wanted for managing power on that system. But they're probably a
> reasonable start. The code behind a system's pm_ops will package a lot
> of decisions about the operating points associated with "standby" etc,
> and it's a bit of work to make sure any given operating point is both
> sensible and functional.
The "on-ness" thingy was about device power states, AFAIK, but not about
/sys/power/state. So not about what some call "operating points" of the
system, but only about specific settings of a "device" or "part of the
system".
> The same points hold for "ACPI_D0"..."ACPI_D3" states. If a device may need
> up to four clocks for its fully operational state, but doesn't need all of
> them for more typical reduced-function (and reduced-power) states, there's
> not necessarily going to be a natural linear/"strict" sequencing of those
> states. "More power for task one and less power to task two" might be just
> as much power as "more-for-two/less-for-one" ... and from the userspace
> perspective, they could act just like "full power for both" for drivers
> that automatically handle the transitions.
Yes. That's why there is talk about having different files describing a
device, and not just one. So you might have four files describing these four
clocks... and yet another file for describing the non-working states.
Dominik
* Re: RE: on-ness
2006-04-21 17:12 ` Dominik Brodowski
@ 2006-04-21 18:30 ` David Brownell
2006-04-21 18:33 ` Dominik Brodowski
0 siblings, 1 reply; 23+ messages in thread
From: David Brownell @ 2006-04-21 18:30 UTC (permalink / raw)
To: Dominik Brodowski; +Cc: linux-pm, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 4223 bytes --]
On Friday 21 April 2006 10:12 am, Dominik Brodowski wrote:
> On Fri, Apr 21, 2006 at 10:03:40AM -0700, David Brownell wrote:
> > > We need to distinguish two aspects here -- the "whole system states", which
> > > in fact create a multi-dimensional problem, and one specific attribute of
> > > one specific device.
> >
> > Individual device operating points are multi-dimensional too. Controllers
> > are just mini-systems after all, and some of the device attributes will
> > be constrained by system attributes ...
>
> Exactly, that is what I was referring to as well :)
Good, it helps when folk are on the same page!
> > The embedded CPUs I've worked with wouldn't
> > have much need for "idle" and "throttling" controls, either.
>
> Other embedded CPUs do, though... and we're trying to abstract things here,
> right?
Not exactly. I'd hope we're solving problems. Generalization makes
solutions work in multiple contexts, and abstraction helps generalization,
but IMO the goal is problem solving.
I've certainly seen cases where abstracting creates/causes problems, rather
than solving them. And creating abstractions that don't make sense on the
hardware is a good start on such confusion...
> > > Well, the big problem with names and anything "system specific" is that it
> > > makes _abstractions_ harder. It makes userspace's life harder, as it needs
> > > to know what "idle" means on a specific system, instead.
> >
> > If by "userspace" we can mean just "what writes the /sys/power/state file",
> > it's straightforward for a given system to provide mappings between some
> > common tokens ("standby", "mem", etc) to a system-specific meaning.
>
> Uh. Not /sys/power/state. But /sys/devices/...../power/{[a],[b],[c]} where
> [a], [b] and [c] need sensible names.
Well, "on" could have one defined meaning. Maybe it's the only option
available, until drivers add intelligence. I don't see any problem
with the other names being system-specific, since it's rather unlikely
that a PCI_D3hot state will ever appear on most embedded ARM boxes.
And if any userspace code tries to set power states, it had darn well
better understand exactly what's going on.
> > Of course, those tokens don't necessarily expose all the meanings that
> > are wanted for managing power on that system. But they're probably a
> > reasonable start. The code behind a system's pm_ops will package a lot
> > of decisions about the operating points associated with "standby" etc,
> > and it's a bit of work to make sure any given operating point is both
> > sensible and functional.
>
> The "on-ness" thingy was about device power states, AFAIK, but not about
> /sys/power/state. So not about what some call "operating points" of the
> system, but only about specific settings of a "device" or "part of the
> system".
Hence my confusion in reading the original note ... Len said it applied
to both, but I found a bias towards CPU power states (rather than device
or system states).
> > The same points hold for "ACPI_D0"..."ACPI_D3" states. If a device may need
> > up to four clocks for its fully operational state, but doesn't need all of
> > them for more typical reduced-function (and reduced-power) states, there's
> > not necessarily going to be a natural linear/"strict" sequencing of those
> > states. "More power for task one and less power to task two" might be just
> > as much power as "more-for-two/less-for-one" ... and from the userspace
> > perspective, they could act just like "full power for both" for drivers
> > that automatically handle the transitions.
>
> Yes. That's why there is talk about having different files describing a
> device, and not just one. So you might have four files describing these four
> clocks... and yet another file for describing the non-working states.
That seems too complicated to me. When debugging, I want to visualize the
entire tree ... so I'd want a /sys/kernel/debug/clocktree file, with lots
of system-specific information. (Which gate bits are set/cleared? What
speeds? etc.) Or else I just want to know which state the driver is in,
like "mostly one". Some of that is taste, but also don't forget that each
attribute in sysfs has a cost.
- Dave
* Re: RE: on-ness
2006-04-21 18:30 ` David Brownell
@ 2006-04-21 18:33 ` Dominik Brodowski
2006-04-21 19:00 ` David Brownell
2006-04-21 19:01 ` RE: on-ness Pavel Machek
0 siblings, 2 replies; 23+ messages in thread
From: Dominik Brodowski @ 2006-04-21 18:33 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 2093 bytes --]
On Fri, Apr 21, 2006 at 11:30:05AM -0700, David Brownell wrote:
> > > > Well, the big problem with names and anything "system specific" is that it
> > > > makes _abstractions_ harder. It makes userspace's life harder, as it needs
> > > > to know what "idle" means on a specific system, instead.
> > >
> > > If by "userspace" we can mean just "what writes the /sys/power/state file",
> > > it's straightforward for a given system to provide mappings between some
> > > common tokens ("standby", "mem", etc) to a system-specific meaning.
> >
> > Uh. Not /sys/power/state. But /sys/devices/...../power/{[a],[b],[c]} where
> > [a], [b] and [c] need sensible names.
>
> Well, "on" could have one defined meaning. Maybe it's the only option
> available, until drivers add intelligence. I don't see any problem
> with the other names being system-specific, since it's rather unlikely
> that a PCI_D3hot state will ever appear on most embedded ARM boxes.
> And if any userspace code tries to set power states, it had darn well
> better understand exactly what's going on.
Yes. However if a network managing userspace code wants to set the power
consumption of a WLAN device to the lowest possible setting, it shouldn't
need a configuration file specific for each platform.
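
One way such platform-independent code might look, as a Python sketch. It
assumes every device advertises its states in a list ordered from most to
least power; that convention, the function name and both example devices
are assumptions of this sketch, not something the proposal defines.

```python
def lowest_power_state(advertised):
    """Pick the deepest state from a list ordered from most to least power.

    The ordering convention is an assumption of this sketch; nothing in the
    thread settles whether devices would advertise states this way.
    """
    if not advertised:
        raise ValueError("device advertises no power states")
    return advertised[-1]


# Two hypothetical WLAN devices with unrelated, platform-specific names.
pci_wlan = ["D0", "D1", "D3hot"]
soc_wlan = ["on", "slow clock", "stop"]
```

The caller never interprets the names, so system-specific strings and a
generic "lowest setting" request can coexist as long as some ordering is
exported alongside them.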
> > Yes. That's why there is talk about having different files describing a
> > device, and not just one. So you might have four files describing these four
> > clocks... and yet another file for describing the non-working states.
>
> That seems too complicated to me. When debugging, I want to visualize the
> entire tree ... so I'd want a /sys/kernel/debug/clocktree file, with lots
> of system-specific information. (Which gate bits are set/cleared? What
> speeds? etc.) Or else I just want to know which state the driver is in,
> like "mostly one". Some of that is taste, but also don't forget that each
> attribute in sysfs has a cost.
Uh, there's a rule "one-value-per-file" for sysfs. Arrays might be OK in
certain cases, but lots of system-specific information in one file? No way,
IMHO.
Thanks,
Dominik
* Re: RE: on-ness
2006-04-21 18:33 ` Dominik Brodowski
@ 2006-04-21 19:00 ` David Brownell
2006-04-21 19:04 ` [OT] debugfs and sysfs [Was: Re: RE: on-ness] Dominik Brodowski
2006-04-21 19:01 ` RE: on-ness Pavel Machek
1 sibling, 1 reply; 23+ messages in thread
From: David Brownell @ 2006-04-21 19:00 UTC (permalink / raw)
To: Dominik Brodowski; +Cc: linux-pm, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 1959 bytes --]
> > Well, "on" could have one defined meaning. Maybe it's the only option
> > available, until drivers add intelligence. I don't see any problem
> > with the other names being system-specific, since it's rather unlikely
> > that a PCI_D3hot state will ever appear on most embedded ARM boxes.
> > And if any userspace code tries to set power states, it had darn well
> > better understand exactly what's going on.
>
> Yes. However if a network managing userspace code wants to set the power
> consumption of a WLAN device to the lowest possible setting, it shouldn't
> need a configuration file specific for each platform.
The current wireless extensions have power management calls, and I'd
expect that whatever replaces them would do so as well. Which brings
out a point I've made before: userspace power management should not as
a rule be driven through /sys/devices/... files, but instead as part
of the normal API to the devices. That's the best way to ensure that
the operations fit into the devices' usage models, rather than as an
ill-fitting frankenstein bolt-on. ;)
> > That seems too complicated to me. When debugging, I want to visualize the
> > entire tree ... so I'd want a /sys/kernel/debug/clocktree file, with lots
> > of system-specific information. (Which gate bits are set/cleared? What
> > speeds? etc.) Or else I just want to know which state the driver is in,
> > like "mostly one". Some of that is taste, but also don't forget that each
> > attribute in sysfs has a cost.
>
> Uh, there's a rule "one-value-per-file" for sysfs. Arrays might be OK in
> certain cases, but lots of system-specific information in one file? No way,
> IMHO.
Remember that sysfs != debugfs, and /sys/kernel/debug is debugfs. Debugfs
explicitly allows seq_printf() and friends. (And for that matter, binary
data in sysfs is more than one-value-per-file ... there are plenty of cases
where that "rule" doesn't need to be obeyed.)
- Dave
* [OT] debugfs and sysfs [Was: Re: RE: on-ness]
2006-04-21 19:00 ` David Brownell
@ 2006-04-21 19:04 ` Dominik Brodowski
0 siblings, 0 replies; 23+ messages in thread
From: Dominik Brodowski @ 2006-04-21 19:04 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 723 bytes --]
On Fri, Apr 21, 2006 at 12:00:45PM -0700, David Brownell wrote:
> > Uh, there's a rule "one-value-per-file" for sysfs. Arrays might be OK in
> > certain cases, but lots of system-specific information in one file? No way,
> > IMHO.
>
> Remember that sysfs != debugfs, and /sys/kernel/debug is debugfs.
Sorry, missed that bit. Editing my fstab right now :)
> (And for that matter, binary
> data in sysfs is more than one-value-per-file ... there are plenty of cases
> where that "rule" doesn't need to be obeyed.)
Binary data is special, indeed, but needs to be rare -- and I sincerely
doubt there are "plenty of cases" for sysfs where this rule doesn't need to
be obeyed, but that would be OT now...
Thanks,
Dominik
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-21 18:33 ` Dominik Brodowski
2006-04-21 19:00 ` David Brownell
@ 2006-04-21 19:01 ` Pavel Machek
2006-04-24 21:04 ` David Brownell
1 sibling, 1 reply; 23+ messages in thread
From: Pavel Machek @ 2006-04-21 19:01 UTC (permalink / raw)
To: Dominik Brodowski; +Cc: David Brownell, linux-pm
[-- Attachment #1: Type: text/plain, Size: 1469 bytes --]
Hi!
> > > > > Well, the big problem with names and anything "system specific" is that it
> > > > > makes _abstractions_ harder. It makes userspace's life harder, as it needs
> > > > > to know what "idle" means on a specific system, instead.
> > > >
> > > > If by "userspace" we can mean just "what writes the /sys/power/state file",
> > > > it's straightforward for a given system to provide mappings between some
> > > > common tokens ("standby", "mem", etc) to a system-specific meaning.
> > >
> > > Uh. Not /sys/power/state. But /sys/devices/...../power/{[a],[b],[c]} where
> > > [a], [b] and [c] need sensible names.
> >
> > Well, "on" could have one defined meaning. Maybe it's the only option
> > available, until drivers add intelligence. I don't see any problem
> > with the other names being system-specific, since it's rather unlikely
> > that a PCI_D3hot state will ever appear on most embedded ARM boxes.
> > And if any userspace code tries to set power states, it had darn well
> > better understand exactly what's going on.
>
> Yes. However if a network managing userspace code wants to set the power
> > consumption of a WLAN device to the lowest possible setting, it shouldn't
> need a configuration file specific for each platform.
I'd say that "on" and "off" are well defined.
For certain classes (like ethernet), other states may be common
between platforms, too, like "off-with-WOL".
Pavel
--
Thanks for all the (sleeping) penguins.
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-21 19:01 ` RE: on-ness Pavel Machek
@ 2006-04-24 21:04 ` David Brownell
2006-04-24 21:32 ` Pavel Machek
0 siblings, 1 reply; 23+ messages in thread
From: David Brownell @ 2006-04-24 21:04 UTC (permalink / raw)
To: linux-pm; +Cc: Dominik Brodowski
[-- Attachment #1: Type: text/plain, Size: 1854 bytes --]
> > > > Uh. Not /sys/power/state. But /sys/devices/...../power/{[a],[b],[c]} where
> > > > [a], [b] and [c] need sensible names.
> > >
> > > Well, "on" could have one defined meaning. Maybe it's the only option
> > > available, until drivers add intelligence. I don't see any problem
> > > with the other names being system-specific, since it's rather unlikely
> > > that a PCI_D3hot state will ever appear on most embedded ARM boxes.
> > > And if any userspace code tries to set power states, it had darn well
> > > better understand exactly what's going on.
> >
> > Yes. However if a network managing userspace code wants to set the power
> > > consumption of a WLAN device to the lowest possible setting, it shouldn't
> > need a configuration file specific for each platform.
>
> I'd say that "on" and "off" are well defined.
Are they? Does "off" imply the device will have been reset the next
time it goes to "on"? If not, there would seem to be two "off" states.
Or maybe more ... PCI_D0 is probably "on", but all of the other PCI
device states seem to be variants of "off", not of "on".
And for that matter, "on" doesn't seem to me to imply anything more
than "full functionality from external POV". That doesn't necessarily
imply "full power-on", and in fact it'd be better if it were using the
lowest power state(s) available. That state might be compatible with
certain system sleep states, or not, depending on the device's workload.
Is either "on" or "off" a suspend state? Why, or why not? :)
> For certain classes (like ethernet), other states may be common
> between platforms, too, like "off-with-WOL".
Actually the wakeup characteristics are orthogonal, there are per-device
bits controlling whether a device can and should do the wakeup. We don't
for example treat "PCI_D3hot with wakeup" as a distinct state.
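The orthogonality described above can be pictured with a minimal sketch: the power state and the wakeup settings live in separate fields, so arming wakeup never creates a new state name. The class and field names here are hypothetical illustrations, not a kernel interface.

```python
from dataclasses import dataclass


@dataclass
class DevicePower:
    """Hypothetical model: power state and wakeup are independent settings."""
    state: str = "PCI_D0"        # current power state name
    can_wakeup: bool = False     # the hardware is able to signal wakeup
    should_wakeup: bool = False  # policy says wakeup is wanted

    def wakeup_armed(self) -> bool:
        # Wakeup is effective only if the device can do it and policy asks for it.
        return self.can_wakeup and self.should_wakeup


nic = DevicePower(can_wakeup=True)
nic.state = "PCI_D3hot"    # change the state; the wakeup bits are untouched
nic.should_wakeup = True   # arm wakeup; the state name is unchanged
```

With this shape there is no need for a separate "PCI_D3hot with wakeup" entry in the list of states.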
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-24 21:04 ` David Brownell
@ 2006-04-24 21:32 ` Pavel Machek
2006-04-24 23:21 ` David Brownell
0 siblings, 1 reply; 23+ messages in thread
From: Pavel Machek @ 2006-04-24 21:32 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm, Dominik Brodowski
[-- Attachment #1: Type: text/plain, Size: 2097 bytes --]
On Po 24-04-06 14:04:40, David Brownell wrote:
>
> > > > > Uh. Not /sys/power/state. But /sys/devices/...../power/{[a],[b],[c]} where
> > > > > [a], [b] and [c] need sensible names.
> > > >
> > > > Well, "on" could have one defined meaning. Maybe it's the only option
> > > > available, until drivers add intelligence. I don't see any problem
> > > > with the other names being system-specific, since it's rather unlikely
> > > > that a PCI_D3hot state will ever appear on most embedded ARM boxes.
> > > > And if any userspace code tries to set power states, it had darn well
> > > > better understand exactly what's going on.
> > >
> > > Yes. However if a network managing userspace code wants to set the power
> > > > consumption of a WLAN device to the lowest possible setting, it shouldn't
> > > need a configuration file specific for each platform.
> >
> > I'd say that "on" and "off" are well defined.
>
> Are they? Does "off" imply the device will have been reset the next
> time it goes to "on"? If not, there would seem to be two "off" states.
> Or maybe more ... PCI_D0 is probably "on", but all of the other PCI
> device states seem to be variants of "off", not of "on".
I'd say "off" is as low as possible, perhaps including device reset.
> And for that matter, "on" doesn't seem to me to imply anything more
> than "full functionality from external POV". That doesn't necessarily
> imply "full power-on", and in fact it'd be better if it were using the
> lowest power state(s) available. That state might be compatible with
> certain system sleep states, or not, depending on the device's
> workload.
Agreed.
> > For certain classes (like ethernet), other states may be common
> > between platforms, too, like "off-with-WOL".
>
> Actually the wakeup characteristics are orthogonal, there are per-device
> bits controlling whether a device can and should do the wakeup. We don't
> for example treat "PCI_D3hot with wakeup" as a distinct state.
Ok, "off-with-WOL" was example. Hopefully there's better example.
Pavel
--
Thanks for all the (sleeping) penguins.
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-21 15:27 ` David Brownell
2006-04-21 15:40 ` Dominik Brodowski
@ 2006-04-21 17:15 ` David Brownell
1 sibling, 0 replies; 23+ messages in thread
From: David Brownell @ 2006-04-21 17:15 UTC (permalink / raw)
To: linux-pm; +Cc: Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 1006 bytes --]
On Friday 21 April 2006 8:27 am, David Brownell wrote:
> And especially for PM, many driver writers are lazy.
Let me update that. The issue is less that the driver writers
are lazy, and more that the system PM infrastructure tends to not
work well on PCs, given issues with BIOS/ACPI/... and the Linux
code that talks to it.
In my own case, I don't even own a PC any more where Linux PM will
properly enter and exit (!!) true power-managed states like "standby"
or "suspend-to-RAM". So there is very little point in putting effort
into driver PM test/debug, beyond the basic test-through-sysfs stuff
(which may not suffice to shake loose bugs in STR paths etc).
On those rare platforms where Linux system infrastructure handles
power sanely -- including letting drivers get information about
the target system state, so suspend() can be smart enough -- the
only real issue with PM is testing. And that's no harder than any
other thing driver writers do; it can often be easier, in fact.
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: RE: on-ness
@ 2006-04-21 17:58 Preece Scott-PREECE
2006-04-21 18:15 ` David Brownell
0 siblings, 1 reply; 23+ messages in thread
From: Preece Scott-PREECE @ 2006-04-21 17:58 UTC (permalink / raw)
To: Pavel Machek, Brown, Len; +Cc: David Brownell, linux-pm
[-- Attachment #1: Type: text/plain, Size: 2687 bytes --]
I think Pavel's points are important. Unless we're prepared to expose
exactly the ACPI semantics (and I don't think we are), we need to make
sure that the namespace is not easily confused with ACPI's namespace.
I would also prefer names to numbers, for the reasons already pointed
out - harder to confuse, less likely to be taken as an
arithmetically-interpretable ordered sequence, and easier to insert new
values into actual ordered sequences.
However, I also have to admit that I like the notion of the zero-state
having a consistent meaning across the different attributes and
it might be good to have a similarly consistent name for the
maximum/full-operation state across the attributes...
scott
-----Original Message-----
From: linux-pm-bounces@lists.osdl.org
[mailto:linux-pm-bounces@lists.osdl.org] On Behalf Of Pavel Machek
Sent: Thursday, April 20, 2006 8:26 AM
To: Brown, Len
Cc: David Brownell; linux-pm@lists.osdl.org
Subject: Re: [linux-pm] RE: on-ness
Hi!
> (restored linux-pm to cc:)
> no, if the numbers happen to correlate to ACPI states, that is purely
> lucky for the ACPI maintainer:-) But honestly, we should "leverage"
> all we can from ACPI.
> My role, of course, is to make sure that the generic description can
> accommodate everything ACPI can do.
Please put a translation layer there from the beginning. Otherwise people
will assume they *are* ACPI states, and great confusion will begin.
See suspend functions, where half people assumed they are acpi Dx
states, half people thought they are pci Dx states, and half the people
assumed they are system Sx states.
It took quite long to sort out.
> yes, idle refers to either Cx or Dx states -- it depends on the
> device. 0 means operating, non zero means non-operating.
>
> Re: strings.
> we briefly debated strings vs numbers.
> I'm not confident that there are enough unique strings available
> without falling into state0, state1, state2, state3 -- so numbers
> seemed simpler.
> The only issue I see with numbers is that they imply order, and it may
> be that some operating points might not have such a strict order.
Second issue is that you might want to "insert" between states.
Let's look at disks. Old disks had:
spinning
spindown
states. You'd name them 0 and 1, eventually apps will learn to use that.
But (some) newer disks have
spinning
slowspin
spindown
If an app does echo 1 > state, it will get slowspin instead of the
spindown it wanted.
Yes, inventing good names may be tricky in some cases, but in other
cases names are very natural (arm has sleep, deep sleep and big sleep,
iirc).... And we can always fall back to state0..state5.
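The disk example above can be sketched in a few lines, assuming a state is selected either by position (as `echo 1 > state` would) or by name. The helper names are hypothetical; only the state lists come from the example.

```python
OLD_DISK = ["spinning", "spindown"]
NEW_DISK = ["spinning", "slowspin", "spindown"]


def set_by_number(states, index):
    """Select a state by its position in the device's list."""
    return states[index]


def set_by_name(states, name):
    """Select a state by name; fail loudly if the device lacks it."""
    if name not in states:
        raise ValueError(f"unsupported state: {name}")
    return name


# The same number means different things on the two disks,
# while the name keeps its meaning after "slowspin" is inserted.
old_choice = set_by_number(OLD_DISK, 1)
new_choice = set_by_number(NEW_DISK, 1)
named_choice = set_by_name(NEW_DISK, "spindown")
```

An app written against the old disk silently gets a different state on the new one when it uses the number, and gets what it asked for when it uses the name.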
Pavel
--
Thanks, Sharp!
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-21 17:58 Preece Scott-PREECE
@ 2006-04-21 18:15 ` David Brownell
0 siblings, 0 replies; 23+ messages in thread
From: David Brownell @ 2006-04-21 18:15 UTC (permalink / raw)
To: Preece Scott-PREECE; +Cc: linux-pm, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 469 bytes --]
On Friday 21 April 2006 10:58 am, Preece Scott-PREECE wrote:
> However, I also have to admit that I like the notion of the zero-state
> having a consistent meaning across the different attributes and
> it might be good to have a similarly consistent name for the
> maximum/full-operation state across the attributes...
Well, not "zero-state" since that implies numbering. But sure,
maybe that's what "on" should mean ... for CPUs, devices, systems.
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: RE: on-ness
@ 2006-04-24 21:32 Woodruff, Richard
2006-04-27 1:39 ` Patrick Mochel
2006-05-01 21:35 ` David Brownell
0 siblings, 2 replies; 23+ messages in thread
From: Woodruff, Richard @ 2006-04-24 21:32 UTC (permalink / raw)
To: David Brownell, linux-pm; +Cc: Dominik Brodowski
[-- Attachment #1: Type: text/plain, Size: 1967 bytes --]
> Are they? Does "off" imply the device will have been reset the next
> time it goes to "on"? If not, there would seem to be two "off" states.
> Or maybe more ... PCI_D0 is probably "on", but all of the other PCI
> device states seem to be variants of "off", not of "on".
That would seem device specific. Applying a reset when moving from
"off" (especially when that is mapped to PCI_D3) would seem
reasonable. As I was told, the rest of the PCI defined states D1-D3 are
non-functional. So they are all OFF states. It just happens that in D3
you can safely physically remove the device. The current pseudo export
of these non-functional states doesn't seem so useful. Having some
fuzzy level of idleness seems much better.
The idle states Len was floating are an attempt to define a level
of on-ness/idleness. As it turns out there was a nice correlation to
some ACPI states. Numbering them seems reasonable; otherwise you're just
wasting effort translating 'state0, state1, state2, ...'. Keeping the
naming convention simple ('0,1,2,..0xffff') would seem to lend itself to
code sharing of necessary class/device specific translation code.
> Actually the wakeup characteristics are orthogonal, there are per-device
> bits controlling whether a device can and should do the wakeup. We don't
> for example treat "PCI_D3hot with wakeup" as a distinct state.
In my current hack attempts I am trying to associate a 'device' D2 state
with a local device sleep with an async wake up enabled. In this state
a device's accesses are locked out until the device wake up event
happens. This async wake can be from a device specific wake up or from
an associated timer resource. Current DPM enabled OMAP sprinkles
suspend lock outs inside of drivers which re-suspend queued waiters and
suspend new requests if the device is in a locked out state. For many
devices this creates a device which just reacts with a higher latency.
Regards,
Richard W.
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-24 21:32 Woodruff, Richard
@ 2006-04-27 1:39 ` Patrick Mochel
2006-05-01 21:35 ` David Brownell
1 sibling, 0 replies; 23+ messages in thread
From: Patrick Mochel @ 2006-04-27 1:39 UTC (permalink / raw)
To: Woodruff, Richard; +Cc: David Brownell, linux-pm, Dominik Brodowski
[-- Attachment #1: Type: text/plain, Size: 3233 bytes --]
On Mon, Apr 24, 2006 at 04:32:54PM -0500, Woodruff, Richard wrote:
> > Are they? Does "off" imply the device will have been reset the next
> > time it goes to "on"? If not, there would seem to be two "off" states.
> > Or maybe more ... PCI_D0 is probably "on", but all of the other PCI
> > device states seem to be variants of "off", not of "on".
>
> That would seem device specific. It would seem that applying a reset
> when moving from "off" (especially that mapped to PCI_D3) would seem
> reasonable. As I was told, the rest of the PCI defined states D1-D3 are
> non-functional. So they are all OFF states.
Yes, they are all off in the sense that they are not operable. However,
there are definitely different levels of off-ness. When a device is in D3
and then transitions to D0, it is assumed to perform an internal device
reset.
Well, up until the PCI PM Spec 1.2, which adds a field to the PCI PM
config space called "NoSoftReset". If a device has that set, then it is
an indication that the device does not perform a soft reset (and
therefore may not lose any state).
In D1 and D2, the devices will not lose as much state (though the amount
is device-specific), but more importantly, the device will not perform a
reset on the transition back to D0, meaning that we don't have to do a
full reinitialization.
[ The memory savings may be insignificant, but saving ourselves from the
process of reinitialization is a big plus. For video devices, it's even
more compelling - the framebuffer may be large, a reinit may take a very
long time, and we may not even know how to do a full reinit. ]
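As a rough sketch of the rule just described, assuming only what is stated above (D1/D2 return to D0 without a reset; D3 resets unless NoSoftReset is set), a driver-side check might look like this. The function name and signature are hypothetical.

```python
def resets_on_resume(from_state: str, no_soft_reset: bool = False) -> bool:
    """Return True if the device is assumed to reset when going back to D0."""
    if from_state in ("D1", "D2"):
        # No reset on the way back, so no full reinitialization is needed.
        return False
    if from_state == "D3":
        # Reset is assumed unless the device advertises NoSoftReset.
        return not no_soft_reset
    raise ValueError(f"not a low-power state: {from_state}")


needs_full_reinit = resets_on_resume("D3")
keeps_state = not resets_on_resume("D3", no_soft_reset=True)
```

A driver that cannot do a full reinit (the video case) would want to stay out of any state for which this returns True.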
> If just happens that in D3 you can safely physically remove the device.
That's not necessarily true, but it's moot for this thread anyway.
> The current pseudo export
> of these non-functional states doesn't seem so useful. Having some
> fuzzy level of idleness seems much better.
Maybe. What's more important is getting rid of the pseudo states. The
drivers should export exactly the states that they support, in whatever
fashion makes the most sense to them. For PCI devices that support basic
PM, this will be "D0" and "D3". PCI devices that support D1 and D2 will
export "D1" and "D2" as well. Devices that support other, device-
specific states will export meaningful names for them.
Different concepts of on-ness should be handled in a similar fashion. It
doesn't really matter if they are sensical strings or if they are digits,
it's all ASCII as it goes through sysfs, and it all has to get parsed,
mapped, and verified at some level.
> In my current hack attempts I am trying to associate a 'device' D2 state
> with a local device sleep with an async wake up enabled. In this state
> a devices accesses are locked out until the device wake up event
> happens. This async wake can be from a device specific wake up or from
> an associated timer resource. Current DPM enabled OMAP sprinkles
> suspend lock outs inside of drivers which re-suspend queued waiters and
> suspends new requests if the device is in a locked out state. From many
> devices this creates a device which just reacts with a higher latency.
That sounds promising. Got any patches that you care to post?
Thanks,
Patrick
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-24 21:32 Woodruff, Richard
2006-04-27 1:39 ` Patrick Mochel
@ 2006-05-01 21:35 ` David Brownell
1 sibling, 0 replies; 23+ messages in thread
From: David Brownell @ 2006-05-01 21:35 UTC (permalink / raw)
To: linux-pm; +Cc: Dominik Brodowski
[-- Attachment #1: Type: text/plain, Size: 2444 bytes --]
On Monday 24 April 2006 2:32 pm, Woodruff, Richard wrote:
> The current pseudo export
> of these non-functional states doesn't seem so useful. Having some
> fuzzy level of idleness seems much better.
I've come to believe that userspace has little or no business caring
about the details of power states ... except to the extent that a driver
exports them. After all, the driver might not support all the possible
hardware states. And the same controller on two different systems might
expose different power management functionality, based on (among other
things) what external hardware is wired up.
> The point of the idle states Len was floating attempts to define a level
> of on-ness/idleness. As it turns out there was a nice correlation to
> some ACPI states. Numbering them seems reasonable, else your just
> wasting effort translating 'state0, state1, state2, ...', keeping the
> naming convention simple '0,1,2,..0xffff' would seem to lend itself to
> code sharing of necessary class/device specific translation code.
I don't think I agree with "seem". We'd get right back into the mess
we now have ... not just expecting "2" to mean the same thing everywhere
(aargh!) but also finding needs for integer "1.5" that is somehow stuck
between "1" and "2". Much easier to do that with strings!
Plus, numbers lead to an assumption that things can be compared.
One is less than two ... but is an apple less than an orange?
In the same way, different power states are not necessarily
comparable on one single axis.
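To make the "apples and oranges" point concrete, here is a small sketch that describes each state on two axes and checks whether one state is no worse than another on both. The state names and all numbers are invented for illustration.

```python
# Hypothetical states: (power draw in mW, resume latency in ms). Made-up values.
STATES = {
    "on": (500, 0),
    "idle": (200, 5),
    "retention": (80, 5),
    "clock-off": (50, 20),
}


def no_worse(a: str, b: str) -> bool:
    """True if state a draws no more power AND resumes no slower than b."""
    power_a, latency_a = STATES[a]
    power_b, latency_b = STATES[b]
    return power_a <= power_b and latency_a <= latency_b


def comparable(a: str, b: str) -> bool:
    """Two states can be ranked only if one is no worse than the other."""
    return no_worse(a, b) or no_worse(b, a)


ranked = comparable("retention", "idle")         # retention wins on power
unranked = comparable("retention", "clock-off")  # trade-off: no single order
```

Any numbering of these four states would have to pick an order for "retention" and "clock-off" that the hardware does not actually define.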
> > Actually the wakeup characteristics are orthogonal, there are
> > per-device bits controlling whether a device can and should do
> > the wakeup. We don't for example treat "PCI_D3hot with wakeup"
> > as a distinct state.
>
> In my current hack attempts I am trying to associate a 'device' D2 state
> with a local device sleep with an async wake up enabled. In this state
> a devices accesses are locked out until the device wake up event
> happens. This async wake can be from a device specific wake up or from
> an associated timer resource.
So would that be a "wake the whole system" or a runtime power
management mechanism to provide functionality without very much
of a power drain? I've done the latter a few times, but never
needed to expose it through sysfs. If it works, it kind of needs
to be enabled at almost all times, to avoid nightmares of config
management and debugging.
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
@ 2006-04-27 14:12 Scott E. Preece
2006-04-27 17:01 ` Patrick Mochel
2006-05-01 21:58 ` David Brownell
0 siblings, 2 replies; 23+ messages in thread
From: Scott E. Preece @ 2006-04-27 14:12 UTC (permalink / raw)
To: mochel; +Cc: david-b, linux-pm, linux
[-- Attachment #1: Type: text/plain, Size: 5015 bytes --]
Let me recast the question a little.
Quite aside from the utility of having names that are meaningful to a
human reader unfamiliar with a particular device, is the problem of
supporting a system-level power policy on top of devices that have
different device-level power states.
So, is the sum of this conversation to this point that it simply isn't
possible to come up with a set of names and attributes that are
meaningful across devices? Or might it be possible to map the set of
special conditions (like the "NoSoftReset" below) to a common vocabulary
that a device could expose to power management and that a generic,
cross-platform power management facility could map to system states and
transitions?
In my domain (consumer devices) it's not such a big deal, because we
pick the devices and can write code [albeit with some effort that we
would rather not expend] to control each device appropriately in the
context of the system's projected use cases. However, even in our domain
we're beginning to need to deal with USB OTG devices being added, and it
would be useful to be able to handle them at least somewhat
intelligently based on attributes that they expose.
scott
| On Mon, Apr 24, 2006 at 04:32:54PM -0500, Woodruff, Richard wrote:
| > > Are they? Does "off" imply the device will have been reset the next
| > > time it goes to "on"? If not, there would seem to be two "off" states.
| > > Or maybe more ... PCI_D0 is probably "on", but all of the other PCI
| > > device states seem to be variants of "off", not of "on".
| >
| > That would seem device specific. It would seem that applying a reset
| > when moving from "off" (especially that mapped to PCI_D3) would seem
| > reasonable. As I was told, the rest of the PCI defined states D1-D3 are
| > non-functional. So they are all OFF states.
|
| Yes, they are all off in the sense that they are not operable. However,
| there are definitely different levels of off-ness. When a device is in D3
| then it transitions to D0, it is assumed to perform an internal device
| reset.
|
| Well, up until the PCI PM Spec 1.2, which adds a field to the PCI PM
| config space called "NoSoftReset". If a device has that set, then it is
| an indication that the device does not perform a soft reset (and there-
| fore may not lose any state).
|
| In D1 and D2, the devices will not lose as much state (though the amount
| is device-specific), but more importantly, the device will not perform a
| reset on the transition back to D0, meaning that we don't have to do a
| full reinitialization.
|
| [ The memory savings may be insignificant, but saving ourselves from the
| process of reinitialziation is a big plus. For video devices, it's even
| more compelling - the framebuffer may be large, a reinit may take a very
| long time, and we may not even know how to do a full reinit. ]
|
| > If just happens that in D3 you can safely physically remove the device.
|
| That's not necessarily true, but it's moot for this thread anyway.
|
| > The current pseudo export
| > of these non-functional states doesn't seem so useful. Having some
| > fuzzy level of idleness seems much better.
|
| Maybe. What's more important is getting rid of the pseudo states. The
| drivers should export exactly the states that they support, in whatever
| fashion makes the most sense to them. For PCI devices that support basic
| PM, this will be "D0" and "D3". PCI devices that support D1 and D2 will
| export "D1" and "D2" as well. Devices that support other, device-
| specific states will export meaningful names for them.
|
| Different concepts of on-ness should be handled in a similar fashion. It
| doesn't really matter if they are sensical strings or if they are digits,
| it's all ASCII as it goes through sysfs, and it all has to get parsed,
| mapped, and verified at some level.
|
| > In my current hack attempts I am trying to associate a 'device' D2 state
| > with a local device sleep with an async wake up enabled. In this state
| > a devices accesses are locked out until the device wake up event
| > happens. This async wake can be from a device specific wake up or from
| > an associated timer resource. Current DPM enabled OMAP sprinkles
| > suspend lock outs inside of drivers which re-suspend queued waiters and
| > suspends new requests if the device is in a locked out state. From many
| > devices this creates a device which just reacts with a higher latency.
|
| That sounds promising. Got any patches that you care to post?
|
| Thanks,
|
|
| Patrick
|
--
scott preece
motorola mobile devices, il67, 1800 s. oak st., champaign, il 61820
e-mail: preece@motorola.com fax: +1-217-384-8550
phone: +1-217-384-8589 cell: +1-217-433-6114 pager: 2174336114@vtext.com
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-27 14:12 Scott E. Preece
@ 2006-04-27 17:01 ` Patrick Mochel
2006-05-01 21:58 ` David Brownell
1 sibling, 0 replies; 23+ messages in thread
From: Patrick Mochel @ 2006-04-27 17:01 UTC (permalink / raw)
To: Scott E. Preece; +Cc: david-b, linux-pm, linux
[-- Attachment #1: Type: text/plain, Size: 1163 bytes --]
On Thu, Apr 27, 2006 at 09:12:31AM -0500, Scott E. Preece wrote:
> So, is the sum of this conversation to this point that it simply isn't
> possible to come up with a set of names and attributes that are
> meaningful across devices? Or might it be possible to map the set of
> special conditions (like the "NoSoftReset" below) to a common vocabulary
> that a device could expose to power management and that a generic,
> cross-platform power management facility could map to system states and
> transitions?
Both. :-)
You can come up with some names and attributes that are common across
devices, and those should be leveraged when possible. But, I don't think
it's possible, or worthwhile, to try to map every device state to a common,
generic (i.e. limited) vocabulary.
You want drivers to export the states that they know about, in the format
that makes the most sense to them (e.g. PCI D0-D3), instead of having them
scratch their head about what to name the states.
From that point (when they're exported), it's pretty simple to condense the
vocabulary into something that makes more sense for the platform (like, "on",
"off", "more off", etc).
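A minimal sketch of that condensing step, assuming the driver exports PCI-style names and the platform supplies its own table. The table contents and the function name are hypothetical.

```python
# Hypothetical platform policy table: driver-exported name -> platform word.
PLATFORM_VOCAB = {
    "D0": "on",
    "D1": "off",
    "D2": "off",
    "D3": "more off",
}


def condense(exported_states):
    """Map the states a driver exports onto the platform's smaller vocabulary."""
    return {name: PLATFORM_VOCAB[name] for name in exported_states}


basic_pm = condense(["D0", "D3"])             # device with basic PM only
full_pm = condense(["D0", "D1", "D2", "D3"])  # device that also has D1 and D2
```

The driver never has to invent names; only the platform table decides that D1 and D2 both count as "off".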
Patrick
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: RE: on-ness
2006-04-27 14:12 Scott E. Preece
2006-04-27 17:01 ` Patrick Mochel
@ 2006-05-01 21:58 ` David Brownell
1 sibling, 0 replies; 23+ messages in thread
From: David Brownell @ 2006-05-01 21:58 UTC (permalink / raw)
To: Scott E. Preece; +Cc: linux-pm, linux
[-- Attachment #1: Type: text/plain, Size: 2859 bytes --]
On Thursday 27 April 2006 7:12 am, Scott E. Preece wrote:
>
> Let me recast the question a little.
>
> Quite aside from the utility of having names that are meaningful to a
> human reader unfamiliar with a particular device, is the problem of
> supporting a system-level power policy on top of devices that have
> different device-level power states.
Related: handling the interactions between system and device power
states. System states commonly constrain device states, e.g. by rules
like "clocks X, Y, and Z are unavailable in system states B and C".
I have an API proposal for that particular problem, but there are
similar ones in other areas, like the available power. (Maybe some
of the supplies have less power available -- or none! -- or switch to
lower voltage modes.)
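The kind of rule quoted above ("clocks X, Y, and Z are unavailable in system states B and C") can be sketched as a lookup, purely as an illustration and not as the API proposal mentioned. The device states and their clock needs are invented.

```python
# Hypothetical rule table: system state -> clocks unavailable in that state.
UNAVAILABLE_CLOCKS = {
    "A": set(),
    "B": {"X", "Y", "Z"},
    "C": {"X", "Y", "Z"},
}

# Hypothetical device states and the clocks each one needs.
DEVICE_NEEDS = {
    "on": {"X"},
    "idle": {"Y"},
    "off": set(),
}


def allowed_device_states(system_state):
    """List device states whose clock needs survive the given system state."""
    blocked = UNAVAILABLE_CLOCKS[system_state]
    return sorted(
        state for state, needs in DEVICE_NEEDS.items() if not (needs & blocked)
    )


in_a = allowed_device_states("A")
in_b = allowed_device_states("B")
```

The same shape would apply to power supplies: replace the clock sets with the supplies or voltage modes a system state takes away.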
> So, is the sum of this conversation to this point that it simply isn't
> possible to come up with a set of names and attributes that are
> meaningful across devices?
Possible is one thing; you can always define an ever-growing set of
attributes. But would it be useful ... or a nightmare to manage,
when scaling over all platforms that Linux handles? I lean towards
the latter.
> Or might it be possible to map the set of
> special conditions (like the "NoSoftReset" below) to a common vocabulary
> that a device could expose to power management and that a generic,
> cross-platform power management facility could map to system states and
> transitions?
There will be some common features, sure, but I'm skeptical about the
notion of a generic cross-platform wunderfacility.
And there are other options. My current favorite is still to expose
device-specific power states purely for test/debug, and expect the
kernel to handle everything correctly.
(We do need a better notion of drivers interacting with a system-wide
power manager. Currently there IS no such notion, and it's a huge
hole.)
> In my domain (consumer devices) it's not such a big deal, because we
> pick the devices and can write code [albeit with some effort that we
> would rather not expend] to control each device appropriately in the
> context of the system's projected use cases. However, even in our domain
> we're beginning to need to deal with USB OTG devices being added, and it
> would be useful to be able to handle them at least somewhat
> intelligently based on attributes that they expose.
Heh. OTG, yes. I've put some thought into that. There happen to be
a few different models to consider ... for example, sometimes there are
separate controllers for host and peripheral roles, as well as an OTG
controller coupled to a transceiver (maybe external/interchangeable);
and sometimes it's all integrated (e.g. host vs peripheral is just
different modes working with the same FIFO/SIE silicon).
Again, I don't see any need to expose a userspace API for those.
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2006-05-01 21:58 UTC | newest]
Thread overview: 23+ messages
2006-04-18 18:39 RE: on-ness Brown, Len
2006-04-20 13:25 ` Pavel Machek
2006-04-21 15:27 ` David Brownell
2006-04-21 15:40 ` Dominik Brodowski
2006-04-21 17:03 ` David Brownell
2006-04-21 17:12 ` Dominik Brodowski
2006-04-21 18:30 ` David Brownell
2006-04-21 18:33 ` Dominik Brodowski
2006-04-21 19:00 ` David Brownell
2006-04-21 19:04 ` [OT] debugfs and sysfs [Was: Re: RE: on-ness] Dominik Brodowski
2006-04-21 19:01 ` RE: on-ness Pavel Machek
2006-04-24 21:04 ` David Brownell
2006-04-24 21:32 ` Pavel Machek
2006-04-24 23:21 ` David Brownell
2006-04-21 17:15 ` David Brownell
-- strict thread matches above, loose matches on Subject: below --
2006-04-21 17:58 Preece Scott-PREECE
2006-04-21 18:15 ` David Brownell
2006-04-24 21:32 Woodruff, Richard
2006-04-27 1:39 ` Patrick Mochel
2006-05-01 21:35 ` David Brownell
2006-04-27 14:12 Scott E. Preece
2006-04-27 17:01 ` Patrick Mochel
2006-05-01 21:58 ` David Brownell