* Nested suspends; messages vs. states
@ 2005-03-21 20:11 Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503211436020.1241-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Alan Stern @ 2005-03-21 20:11 UTC (permalink / raw)
To: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3162 bytes --]
Here are a couple of issues I want to raise before the next IRC session.
Nested suspends: We know that the PM core tries to avoid increasing a
device's suspend level (i.e., FREEZE -> SUSPEND) as part of a system
sleep. However... The core won't have a very good idea of a device's
initial state, and a device may already be suspended when the system sleep
begins. We have decided that devices' power states are represented by
pointers to structures defined at the bus or device level; the PM core
won't know how to interpret them. So it won't know whether a device is
already suspended.
There's also the possibility that as part of runtime power management, a
user might tell an already-suspended device to go to a different, but
still suspended, power state. The core can't filter out such requests
because it doesn't understand the states. It's not even clear that such
requests _should_ be filtered out. PM-aware PCI devices, for example,
have no trouble moving from D1 to D2.
The simplest way of handling this is to allow explicitly for such
possibilities. When a device is asked to go from a very-low-power state
to a slightly-low-power state, it should be legal for the driver to leave
it in the very-low-power state. It should also be legal for the driver to
go to full power temporarily, then down to the requested power level. In
particular, if a device is already suspended then it should be okay for
the driver to do nothing and still return Success for a FREEZE or SUSPEND
request -- and this fact should be documented.
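As a minimal sketch of this rule (not code from the kernel; every name below is hypothetical and the three-level device is an assumption made for illustration), a driver's suspend path could treat a request on an already-suspended device as a successful no-op:

```c
#include <stdbool.h>

/* Hypothetical generic requests, named after the codes in this thread. */
enum demo_request { DEMO_ON, DEMO_FREEZE, DEMO_SUSPEND };

/* Hypothetical driver-private power levels, ordered by depth. */
enum demo_level { LEVEL_FULL = 0, LEVEL_LIGHT = 1, LEVEL_DEEP = 2 };

struct demo_device {
    enum demo_level level;  /* current hardware power level */
    bool quiesced;          /* true once I/O has been stopped */
    int hw_transitions;     /* counts real hardware state changes */
};

/* Returns 0 on success, -1 for a request this path does not handle. */
int demo_suspend(struct demo_device *dev, enum demo_request req)
{
    if (req == DEMO_ON)
        return -1;                  /* resume is a separate path */

    dev->quiesced = true;           /* both FREEZE and SUSPEND quiesce */

    if (req == DEMO_FREEZE)
        return 0;                   /* assumption: FREEZE needs no power change */

    if (dev->level >= LEVEL_LIGHT)
        return 0;                   /* already suspended: do nothing, report success */

    dev->level = LEVEL_LIGHT;       /* the only real transition in this sketch */
    dev->hw_transitions++;
    return 0;
}
```

A device that is already in LEVEL_DEEP stays there and the call still succeeds, which is the behaviour the paragraph above says should be legal and documented.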
Another way to handle this is to include a generic "low power" flag as a
standard part of the new power-state structures. That way the core would
at least know whether a device was at full power. (Maybe include a
"quiescent" flag too, since some devices can be operational while at low
power.) While this isn't a bad idea, I rather favor the other approach.
Of course, we can always do both.
Messages vs. states: At the moment the PM core seems to be pretty
confused over this distinction. Right in the definition of struct
dev_pm_info we have:
pm_message_t power_state;
Obviously a message isn't the same thing as a state. This looks like
something that will need to be changed in a lot of drivers when we
introduce the new notion of a power state.
As a corollary we have the problem of what to include in the argument
passed to a suspend callback. It should be a message, clearly, and
part of the message should be an indication of which state to go to. The
question is, how is this state represented? For device power management
we will want to provide a genuine power state (i.e., pointer to bus- or
device-specific structure). For system power management we will want to
provide a generic code -- PMSG_ON, PMSG_FREEZE, or PMSG_SUSPEND -- which
the driver will map to a real power state.
It seems to me the best way to do this is to let pm_message_t include both
a generic code and a power-state pointer. There should be a new code
added (PMSG_RUNTIME? or maybe PMSG_DEVICE?), meaning that the driver
should use the state pointer. Otherwise the driver maps the generic code.
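To make the proposal concrete, here is a rough sketch under stated assumptions: the type and constant names are invented for illustration (they are not the real pm_message_t layout), and the mapping from generic codes to PCI-like states is just one possible driver policy.

```c
#include <stddef.h>

/* Hypothetical stand-in for a bus- or device-specific power state. */
struct demo_power_state {
    const char *name;
};

/* Hypothetical event codes; DEMO_PMSG_RUNTIME is the proposed new code. */
enum demo_event {
    DEMO_PMSG_ON,
    DEMO_PMSG_FREEZE,
    DEMO_PMSG_SUSPEND,
    DEMO_PMSG_RUNTIME
};

/* A message carries both a generic code and an optional state pointer. */
struct demo_pm_message {
    enum demo_event event;
    const struct demo_power_state *state;  /* meaningful only for RUNTIME */
};

/* Example states for a PCI-like device. */
const struct demo_power_state demo_d0 = { "D0" };
const struct demo_power_state demo_d1 = { "D1" };
const struct demo_power_state demo_d2 = { "D2" };
const struct demo_power_state demo_d3 = { "D3" };

/* Driver-side mapping: use the pointer for RUNTIME, else map the code. */
const struct demo_power_state *demo_target_state(struct demo_pm_message msg)
{
    switch (msg.event) {
    case DEMO_PMSG_RUNTIME:
        return msg.state;
    case DEMO_PMSG_ON:
        return &demo_d0;
    case DEMO_PMSG_FREEZE:
        return &demo_d0;   /* assumption: quiesce only, stay powered */
    case DEMO_PMSG_SUSPEND:
        return &demo_d3;   /* assumption: deepest state for system sleep */
    }
    return NULL;
}
```

With this shape, a runtime request for D2 is passed as { DEMO_PMSG_RUNTIME, &demo_d2 }, while a system sleep only sets the generic code and leaves the choice of state to the driver.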
Alan Stern
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503211436020.1241-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-21 20:20 ` Pavel Machek
[not found] ` <20050321202016.GI1390-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-22 4:21 ` Benjamin Herrenschmidt
2005-03-23 0:52 ` Patrick Mochel
2 siblings, 1 reply; 72+ messages in thread
From: Pavel Machek @ 2005-03-21 20:20 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 722 bytes --]
Hi!
> Messages vs. states: At the moment the PM core seems to be pretty
> confused over this distinction. Right in the definition of struct
> dev_pm_info we have:
>
> pm_message_t power_state;
>
> Obviously a message isn't the same thing as a state. This looks like
> something that will need to be changed in a lot of drivers when we
> introduce the new notion of a power state.
This is not so obvious to me. A message seems to represent the state the
driver is in quite well... Plus we need PMSG_ON for the state the device
gets after resume, but that's quite easy...
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
* Re: Nested suspends; messages vs. states
[not found] ` <20050321202016.GI1390-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-21 21:14 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503211613010.2329-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Alan Stern @ 2005-03-21 21:14 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 728 bytes --]
On Mon, 21 Mar 2005, Pavel Machek wrote:
> Hi!
>
> > Messages vs. states: At the moment the PM core seems to be pretty
> > confused over this distinction. Right in the definition of struct
> > dev_pm_info we have:
> >
> > pm_message_t power_state;
> >
> > Obviously a message isn't the same thing as a state. This looks like
> > something that will need to be changed in a lot of drivers when we
> > introduce the new notion of a power state.
>
> This is not so obvious to me. Message seems to represent the state
> driver is in quite well... Plus we need PMSG_ON for state the device
> gets after resume, but that's quite easy...
So what value for power_state do you use to tell apart PCI D1 from D2?
Alan Stern
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503211613010.2329-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-21 22:26 ` Pavel Machek
[not found] ` <20050321222609.GK1390-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Pavel Machek @ 2005-03-21 22:26 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1051 bytes --]
On Mon 21-03-05 16:14:09, Alan Stern wrote:
> On Mon, 21 Mar 2005, Pavel Machek wrote:
>
> > Hi!
> >
> > > Messages vs. states: At the moment the PM core seems to be pretty
> > > confused over this distinction. Right in the definition of struct
> > > dev_pm_info we have:
> > >
> > > pm_message_t power_state;
> > >
> > > Obviously a message isn't the same thing as a state. This looks like
> > > something that will need to be changed in a lot of drivers when we
> > > introduce the new notion of a power state.
> >
> > This is not so obvious to me. Message seems to represent the state
> > driver is in quite well... Plus we need PMSG_ON for state the device
> > gets after resume, but that's quite easy...
>
> So what value for power_state do you use to tell apart PCI D1 from D2?
You just ask via pci_choose_state(), and it tells
you... pci_choose_state() should be deterministic.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
* Re: Nested suspends; messages vs. states
[not found] ` <20050321222609.GK1390-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-22 3:08 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503212140450.28689-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
2005-03-23 18:32 ` David Brownell
1 sibling, 1 reply; 72+ messages in thread
From: Alan Stern @ 2005-03-22 3:08 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1258 bytes --]
On Mon, 21 Mar 2005, Pavel Machek wrote:
> > > > Obviously a message isn't the same thing as a state. This looks like
> > > > something that will need to be changed in a lot of drivers when we
> > > > introduce the new notion of a power state.
> > >
> > > This is not so obvious to me. Message seems to represent the state
> > > driver is in quite well... Plus we need PMSG_ON for state the device
> > > gets after resume, but that's quite easy...
> >
> > So what value for power_state do you use to tell apart PCI D1 from D2?
>
> You just ask via pci_choose_state(), and it tells
> you... pci_choose_state() should be deterministic.
That's useless for sysfs. It won't know to call pci_choose_state when it
has to display the device's current power state in an attribute file. Nor
will it know what pm_message_t to send the driver when the user writes the
string "D2" to the attribute.
Even just from first principles the mistake is apparent. pm_message_t is
(or will be, when the structure is defined in its final form) a _message_,
not a _state_. It contains (will contain) things other than the power
state setting, such as the "flags" field. Why would a device want to
store pm_message_t.flags as part of its current state?
Alan Stern
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503211436020.1241-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-21 20:20 ` Pavel Machek
@ 2005-03-22 4:21 ` Benjamin Herrenschmidt
2005-03-22 17:04 ` Alan Stern
2005-03-23 18:58 ` David Brownell
2005-03-23 0:52 ` Patrick Mochel
2 siblings, 2 replies; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-22 4:21 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 7688 bytes --]
On Mon, 2005-03-21 at 15:11 -0500, Alan Stern wrote:
> Here are a couple of issues I want to raise before the next IRC session.
>
>
> Nested suspends: We know that the PM core tries to avoid increasing a
> device's suspend level (i.e., FREEZE -> SUSPEND) as part of a system
> sleep. However... The core won't have a very good idea of a device's
> initial state, and a device may already be suspended when the system sleep
> begins. We have decided that devices' power states are represented by
> pointers to structures defined at the bus or device level; the PM core
> won't know how to interpret them. So it won't know whether a device is
> already suspended.
Yup, the model we are designing now should allow for arbitrary
transitions, I suppose, as long as the target state is legal vs. the
dependencies. Also, a system "suspend" state, for example, is enforced by
the system, and a user shouldn't be allowed to change it until the system
has resumed (thinking here about a spurious user change coming in after
the suspend call). I suppose the drivers should have some sort of flags
in the message telling whether this is a user-initiated transition or a
system-initiated transition (hrm... or rather, whether it's initiated by
the user directly on this device, or is the result of a state change
about to happen at the parent level).
> There's also the possibility that as part of runtime power management, a
> user might tell an already-suspended device to go to a different, but
> still suspended, power state. The core can't filter out such requests
> because it doesn't understand the states. It's not even clear that such
> requests _should_ be filtered out. PM-aware PCI devices, for example,
> have no trouble moving from D1 to D2.
The drivers are the only ones that know what is legal, I suppose...
> The simplest way of handling this is to allow explicitly for such
> possibilities. When a device is asked to go from a very-low-power state
> to a slightly-low-power state, it should be legal for the driver to leave
> it in the very-low-power state.
Well... I'm not sure about that one. If the power states represent some
performance states, the system may want to raise the performance a bit
at the expense of power; would the device then stay at low performance
unless a full transition to the "full on" state is done?
I suspect it's a per-driver responsibility here. I suppose common sense
will dictate what can be allowed and what not.
One thing is that some states may be transitory. This is also a flag in the
message, I suppose. A system state is "permanent" in the sense that only
a system wakeup will undo a system suspend. But a driver-originated
(idle timer) suspend needs to trigger an auto-wakeup of the driver. I
suspect that in most cases, user-originated states are that way too: if
the user explicitly suspends his HD (puts it to SLEEP) via sysfs
(which can be represented by some kind of GUI thing in a GNOME or KDE
panel, like MacOS used to do), he still wants the drive to spin up again
as soon as a request gets there.
But then again, that is mostly per-driver policy driven by common sense.
A system state is enforceable, a user state may not be...
> It should also be legal for the driver to
> go to full power temporarily, then down to the requested power level.
Yes.
> In particular, if a device is already suspended then it should be okay for
> the driver to do nothing and still return Success for a FREEZE or SUSPEND
> request -- and this fact should be documented.
Possibly, but is the state actually changed? If the driver has a state
"suspend" and gets a "freeze" request, does it stay in the "suspend" state?
If the driver is in a user-originated or idle-originated "suspend" state
(with auto-wakeup on activity) and gets a FREEZE (or another SUSPEND)
from the system, it must make sure not to auto-wakeup anymore from that
state. There is a bit of policy to implement here, and I'm not sure how
much of that can be put in the core to help drivers, and how much has to
remain driver-specific.
The goal is to be as simple as possible or drivers will never get it
right, _BUT_ on the other hand, this is a complex problem and we can
probably not hide all of the difficulties. At some point, drivers will
have problems that will have to be fixed on a per-driver basis.
> Another way to handle this is to include a generic "low power" flag as a
> standard part of the new power-state structures. That way the core would
> at least know whether a device was at full power. (Maybe include a
> "quiescent" flag too, since some devices can be operational while at low
> power.) While this isn't a bad idea, I rather favor the other approach.
> Of course, we can always do both.
I'm not sure... Do we care ? Just tell the driver and see what it does,
the driver doesn't have to go to the state we requested I suppose.
One thing that is important if we deal with partial suspend and tree
dependencies is the ordering...
When a device is asked to enter a given state, the dependencies of the
children have to be checked in a different order if we are going to lower
power than if we are going to higher power.
If going to lower power (that is, toward suspend), we must check the
dependencies of the children and, if necessary, put them in low power
before the parent's state is actually changed.
If going to higher power, it is the opposite.
However, if the driver goes to a different state, it must go to a state
that doesn't break that rule. If the parent is asked to go to a deeper
state, that means its children will have already been put in a deeper
state to match the dependency of the new state. That means the driver
must not go to a state that breaks that dependency. It can go to a
less-deep state than asked but can't go to a deeper one since the children
may not be ready for it. The same goes in the opposite direction.
So I think we need to have the states in some sort of order at least, so
the core has a notion of what is "lower" and what is "higher" power to
deal with that. Though I suppose we could also have optional hooks in
drivers (pre-parent-change and post-parent-change) for drivers that want to
be sneaky, but that gets nasty and complicated.
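The ordering rule can be sketched roughly as follows. This is an illustration only: it assumes states can be reduced to a single integer "depth" (0 = full power) and that a child must always be at least as deep as its parent. Neither assumption has been settled in this discussion, and all names are hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_CHILDREN 4
#define DEMO_LOG_SIZE 16

struct demo_node {
    const char *name;
    int depth;                      /* 0 = full power; larger = deeper */
    struct demo_node *parent;
    struct demo_node *children[DEMO_MAX_CHILDREN];
    size_t nchildren;
};

/* Records the order in which devices actually change state. */
struct demo_log {
    const char *names[DEMO_LOG_SIZE];
    size_t count;
};

static void demo_apply(struct demo_node *n, int depth, struct demo_log *log)
{
    if (n->depth == depth)
        return;
    n->depth = depth;
    if (log->count < DEMO_LOG_SIZE)
        log->names[log->count++] = n->name;
}

void demo_set_depth(struct demo_node *n, int depth, struct demo_log *log)
{
    size_t i;

    if (depth > n->depth) {
        /* Going to lower power: children first, then this device. */
        for (i = 0; i < n->nchildren; i++)
            if (n->children[i]->depth < depth)
                demo_set_depth(n->children[i], depth, log);
    } else if (depth < n->depth) {
        /* Going to higher power: parent first, then this device. */
        if (n->parent != NULL && n->parent->depth > depth)
            demo_set_depth(n->parent, depth, log);
    }
    demo_apply(n, depth, log);
}
```

Children that are already deep enough are left alone on the way down, and siblings are not touched on the way up, so the invariant holds without waking anything that did not need to wake.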
> Messages vs. states: At the moment the PM core seems to be pretty
> confused over this distinction. Right in the definition of struct
> dev_pm_info we have:
>
> pm_message_t power_state;
>
> Obviously a message isn't the same thing as a state. This looks like
> something that will need to be changed in a lot of drivers when we
> introduce the new notion of a power state.
>
> As a corollary we have the problem of what to include in the argument
> passed to a suspend callback. It should be a message, clearly, and
> part of the message should be an indication of which state to go to. The
> question is, how is this state represented? For device power management
> we will want to provide a genuine power state (i.e., pointer to bus- or
> device-specific structure). For system power management we will want to
> provide a generic code -- PMSG_ON, PMSG_FREEZE, or PMSG_SUSPEND -- which
> the driver will map to a real power state.
>
> It seems to me the best way to do this is to let pm_message_t include both
> a generic code and a power-state pointer. There should be a new code
> added (PMSG_RUNTIME? or maybe PMSG_DEVICE?), meaning that the driver
> should use the state pointer. Otherwise the driver maps the generic code.
>
> Alan Stern
>
--
Benjamin Herrenschmidt <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503212140450.28689-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2005-03-22 11:08 ` Pavel Machek
[not found] ` <20050322110802.GA1751-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Pavel Machek @ 2005-03-22 11:08 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1525 bytes --]
Hi!
> > > > > Obviously a message isn't the same thing as a state. This looks like
> > > > > something that will need to be changed in a lot of drivers when we
> > > > > introduce the new notion of a power state.
> > > >
> > > > This is not so obvious to me. Message seems to represent the state
> > > > driver is in quite well... Plus we need PMSG_ON for state the device
> > > > gets after resume, but that's quite easy...
> > >
> > > So what value for power_state do you use to tell apart PCI D1 from D2?
> >
> > You just ask via pci_choose_state(), and it tells
> > you... pci_choose_state() should be deterministic.
>
> That's useless for sysfs. It won't know to call pci_choose_state when it
> has to display the device's current power state in an attribute file. Nor
> will it know what pm_message_t to send the driver when the user writes the
> string "D2" to the attribute.
And do we really want the user writing D2 to a /sys file?
> Even just from first principles the mistake is apparent. pm_message_t is
> (or will be, when the structure is defined in its final form) a _message_,
> not a _state_. It contains (will contain) things other than the power
> state setting, such as the "flags" field. Why would a device want to
> store pm_message_t.flags as part of its current state?
Because a device may enter different hw states for different flags?
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
* Re: Nested suspends; messages vs. states
2005-03-22 4:21 ` Benjamin Herrenschmidt
@ 2005-03-22 17:04 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503221143460.954-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-23 18:58 ` David Brownell
1 sibling, 1 reply; 72+ messages in thread
From: Alan Stern @ 2005-03-22 17:04 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 6364 bytes --]
On Tue, 22 Mar 2005, Benjamin Herrenschmidt wrote:
> Yup, the model we are designing now should allow for arbitrary
> transitions, I suppose, as long as the target state is legal vs. the
> dependencies. Also, a system "suspend" state, for example, is enforced by
> the system, and a user shouldn't be allowed to change it until the system
> has resumed (thinking here about a spurious user change coming in after
> the suspend call).
When per-device locking gets added to the driver model, this can be
handled by making the PM core lock all devices before starting STR and
unlock them all after waking up. Then any user process trying to resume a
device in the middle will block until the system is fully awake.
> I suppose the drivers should have some sort of flags
> in the message telling whether this is a user-initiated transition or a
> system-initiated transition (hrm... or rather, whether it's initiated by
> the user directly on this device, or is the result of a state change
> about to happen at the parent level).
My suggestion was to use a new code, PMSG_RUNTIME or something like that,
for suspend calls coming from the user. Are we okay with no equivalent
code for resume calls?
> > The simplest way of handling this is to allow explicitly for such
> > possibilities. When a device is asked to go from a very-low-power state
> > to a slightly-low-power state, it should be legal for the driver to leave
> > it in the very-low-power state.
>
> Well... I'm not sure about that one. If the power states represent some
> performance states, the system may want to raise the performance a bit
> at the expense of power; would the device then stay at low performance
> unless a full transition to the "full on" state is done?
If the system wants to raise the performance, the safest route is to
detour through "full power" on the way. If the user (or a userspace
policy manager) skips the "full power" step, they get what they deserve.
> I suspect it's a per-driver responsibility here. I suppose common sense
> will dictate what can be allowed and what not.
Yes.
> One thing is that some states may be transitory. This is also a flag in the
> message, I suppose. A system state is "permanent" in the sense that only
> a system wakeup will undo a system suspend. But a driver-originated
> (idle timer) suspend needs to trigger an auto-wakeup of the driver. I
> suspect that in most cases, user-originated states are that way too: if
> the user explicitly suspends his HD (puts it to SLEEP) via sysfs
> (which can be represented by some kind of GUI thing in a GNOME or KDE
> panel, like MacOS used to do), he still wants the drive to spin up again
> as soon as a request gets there.
>
> But then again, that is mostly per-driver policy driven by common sense.
> A system state is enforceable, a user state may not be...
We can enforce the system sleep states by device locking as described
above. For STD no enforcement is needed, because no processes other than
the PM thread will be running. (Except for things with PF_NOFREEZE --
they are in a position to cause some trouble.)
> > In particular, if a device is already suspended then it should be okay for
> > the driver to do nothing and still return Success for a FREEZE or SUSPEND
> > request -- and this fact should be documented.
>
> Possibly, but is the state actually changed ? If the driver has a state
> "suspend" and gets a "freeze" request, does it stay in "suspend" state ?
Up to the driver. The only requirement for FREEZE is that the device must
be quiesced; the actual state doesn't matter.
> If the driver is in a user-originated or idle-originated "suspend" state
> (with auto-wakeup on activity) and gets a FREEZE (or another SUSPEND)
> from the system, it must make sure not to auto-wakeup anymore from that
> state. There is a bit of policy to implement here, and I'm not sure how
> much of that can be put in the core to help drivers, and how much has to
> remain driver specific.
Locking should take care of this, once it's available.
> > Another way to handle this is to include a generic "low power" flag as a
> > standard part of the new power-state structures. That way the core would
> > at least know whether a device was at full power. (Maybe include a
> > "quiescent" flag too, since some devices can be operational while at low
> > power.) While this isn't a bad idea, I rather favor the other approach.
> > Of course, we can always do both.
>
> I'm not sure... Do we care ? Just tell the driver and see what it does,
> the driver doesn't have to go to the state we requested I suppose.
I'm not sure either. I guess we shouldn't worry about adding these flags
unless it becomes clear that they are needed.
> One thing that is important if we deal with partial suspend and tree
> dependencies is the ordering...
>
> When a device is asked to enter a given state, the dependencies of the
> children have to be checked in a different order if we are going to lower
> power than if we are going to higher power.
>
> If going to lower power (that is, toward suspend), we must check the
> dependencies of the children and, if necessary, put them in low power
> before the parent's state is actually changed.
>
> If going to higher power, it is the opposite.
>
> However, if the driver goes to a different state, it must go to a state
> that doesn't break that rule. If the parent is asked to go to a deeper
> state, that means its children will have already been put in a deeper
> state to match the dependency of the new state. That means the driver
> must not go to a state that breaks that dependency. It can go to a
> less-deep state than asked but can't go to a deeper one since the children
> may not be ready for it. The same goes in the opposite direction.
>
> So I think we need to have the states in some sort of order at least, so
> the core has a notion of what is "lower" and what is "higher" power to
> deal with that. Though I suppose we could also have optional hooks in
> drivers (pre-parent-change and post-parent-change) for drivers that want to
> be sneaky, but that gets nasty and complicated.
I agree. This is the sort of boilerplate computation that is best done in
one single place -- the PM core. Unfortunately it means that the core has
to understand what combinations of parent-state/child-state are legal. I
don't know how that knowledge can best be represented.
Alan Stern
* Re: Nested suspends; messages vs. states
[not found] ` <20050322110802.GA1751-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-22 17:24 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503221216430.954-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Alan Stern @ 2005-03-22 17:24 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1087 bytes --]
On Tue, 22 Mar 2005, Pavel Machek wrote:
> And do we really want user writing D2 to /sys file?
Yes, absolutely. And we want the power/state file to contain "D2" when
a PCI device is in that state.
> > Even just from first principles the mistake is apparent. pm_message_t is
> > (or will be, when the structure is defined in its final form) a _message_,
> > not a _state_. It contains (will contain) things other than the power
> > state setting, such as the "flags" field. Why would a device want to
> > store pm_message_t.flags as part of its current state?
>
> Because device may enter different hw states for different flags?
But once the device is in a particular state, the reason why it entered
that state doesn't matter any more. Certainly it shouldn't be _part_ of
the state.
Consider this: Device states are bus- or device-specific structures, as
discussed before. But the PM core can export a set of minimal generic
state structures for use by drivers that don't need anything more
complicated than On, Frozen, or Suspended. How does that sound?
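A possible shape for this, sketched with invented names (nothing here is an existing kernel interface): each state structure carries a printable name, the core exports a small generic table, and a bus such as PCI supplies its own table. The same lookup then serves both the generic case and a user writing "D2" to a sysfs attribute.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical power-state structure; real ones would carry more. */
struct demo_state {
    const char *name;   /* what a sysfs attribute would show */
};

/* Minimal generic states the core could export for simple drivers. */
const struct demo_state demo_generic_states[] = {
    { "on" }, { "frozen" }, { "suspended" }
};

/* A bus-specific table, here for a PCI-like device. */
const struct demo_state demo_pci_states[] = {
    { "D0" }, { "D1" }, { "D2" }, { "D3" }
};

/* Translate user-supplied text into a state from the given table. */
const struct demo_state *demo_find_state(const struct demo_state *table,
                                         size_t count, const char *text)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strcmp(table[i].name, text) == 0)
            return &table[i];
    return NULL;    /* unknown state name for this device */
}
```

Because the lookup goes through whichever table the device registered, the core never needs to interpret "D2" itself; it only compares names and hands the resulting pointer to the driver.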
Alan Stern
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503221143460.954-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-22 23:36 ` Benjamin Herrenschmidt
2005-03-23 1:17 ` Patrick Mochel
1 sibling, 0 replies; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-22 23:36 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 2682 bytes --]
On Tue, 2005-03-22 at 12:04 -0500, Alan Stern wrote:
> We can enforce the system sleep states by device locking as described
> above. For STD no enforcement is needed, because no processes other than
> the PM thread will be running. (Except for things with PF_NOFREEZE --
> they are in a position to cause some trouble.)
I'm not a fan of this "locking" concept... Well, we are effectively
locking the PM state of the device, true, but I'm not sure it should act
as a lock; it should rather reject transitions, no? Or maybe you are
right, and we should just have a high-level lock set when starting
STD/STR that blocks any user-originated action?
>
> > If the driver is in a user-originated or idle-originated "suspend" state
> > (with auto-wakeup on activity) and gets a FREEZE (or another SUSPEND)
> > from the system, it must make sure not to auto-wakeup anymore from that
> > state. There is a bit of policy to implement here, and I'm not sure how
> > much of that can be put in the core to help drivers, and how much has to
> > remain driver specific.
>
> Locking should take care of this, once it's available.
I'm not sure, but maybe ... I need to think about this "locking" concept
a bit more.
> I'm not sure either. I guess we shouldn't worry about adding these flags
> unless it becomes clear that they are needed.
Agreed.
> > So I think we need to have the states in some sort of order at least, so
> > the core has a notion of what is "lower" and what is "higher" power to
> > deal with that. Though I suppose we could also have optional hooks in
> > drivers (pre-parent-change and post-parent-change) for drivers that want to
> > be sneaky, but that gets nasty and complicated.
>
> I agree. This is the sort of boilerplate computation that is best done in
> one single place -- the PM core. Unfortunately it means that the core has
> to understand what combinations of parent-state/child-state are legal. I
> don't know how that knowledge can best be represented.
Well, I had this idea of expressing dependencies with bitmasks of
states, that is a device would express for each state what parent states
it is compatible with (since parent states are de-facto bus-states, it
is ok for a device to know the semantics of the bus it's living on. PCI
busses (and thus parent -> pci bridges) will have a well defined set of
states, only leaf devices can have 'fancy' states).
Either that, or we need more callbacks for validating states, but I'm
afraid that would become a mess... Most devices will only have 2 states
anyway, and we can probably provide shortcut macros for defining simple
2-state tables, which will make things easy for most drivers.
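A rough sketch of the bitmask idea, with hypothetical names throughout: each device state lists, as a bitmask, the parent (bus) states it can live with, and a shortcut macro builds the common two-state table. The particular masks chosen are assumptions for illustration, not a statement about real PCI rules.

```c
#include <stdbool.h>

/* Hypothetical bus states, e.g. for a PCI-like parent. */
enum demo_bus_state { BUS_D0 = 0, BUS_D1, BUS_D2, BUS_D3 };

#define DEMO_BUS_MASK(s) (1u << (s))
#define DEMO_BUS_ANY \
    (DEMO_BUS_MASK(BUS_D0) | DEMO_BUS_MASK(BUS_D1) | \
     DEMO_BUS_MASK(BUS_D2) | DEMO_BUS_MASK(BUS_D3))

struct demo_dev_state {
    const char *name;
    unsigned int parent_ok;   /* bitmask of tolerated parent states */
};

/* Shortcut for the common case: "on" needs a fully powered parent,
 * "suspended" tolerates any parent state. */
#define DEMO_SIMPLE_STATES(var) \
    const struct demo_dev_state var[2] = { \
        { "on", DEMO_BUS_MASK(BUS_D0) }, \
        { "suspended", DEMO_BUS_ANY } \
    }

DEMO_SIMPLE_STATES(demo_nic_states);

/* The core can validate a combination without understanding the states. */
bool demo_parent_allowed(const struct demo_dev_state *child,
                         enum demo_bus_state parent)
{
    return (child->parent_ok & DEMO_BUS_MASK(parent)) != 0;
}
```

The core only tests bits, so all knowledge of what the states mean stays in the table supplied by the driver or bus.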
Ben.
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503211436020.1241-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-21 20:20 ` Pavel Machek
2005-03-22 4:21 ` Benjamin Herrenschmidt
@ 2005-03-23 0:52 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503221635130.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-23 0:52 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3756 bytes --]
On Mon, 21 Mar 2005, Alan Stern wrote:
> Here are a couple of issues I want to raise before the next IRC session.
>
>
> Nested suspends: We know that the PM core tries to avoid increasing a
> device's suspend level (i.e., FREEZE -> SUSPEND) as part of a system
> sleep. However... The core won't have a very good idea of a device's
> initial state, and a device may already be suspended when the system sleep
> begins. We have decided that devices' power states are represented by
> pointers to structures defined at the bus or device level; the PM core
> won't know how to interpret them. So it won't know whether a device is
> already suspended.
I think the core should always call ->suspend() for a device, regardless
of whether it thinks it's in a low power state, or inactive. This is
specifically for the reason that a device could be in a low-power runtime
state when the system is suspended.
To assist this, there should probably be only 1 list to hold the PM nodes,
making the code a lot simpler.
> There's also the possibility that as part of runtime power management, a
> user might tell an already-suspended device to go to a different, but
> still suspended, power state. The core can't filter out such requests
> because it doesn't understand the states. It's not even clear that such
> requests _should_ be filtered out. PM-aware PCI devices, for example,
> have no trouble moving from D1 to D2.
Agree.
> Another way to handle this is to include a generic "low power" flag as a
> standard part of the new power-state structures. That way the core would
> at least know whether a device was at full power. (Maybe include a
> "quiescent" flag too, since some devices can be operational while at low
> power.) While this isn't a bad idea, I rather favor the other approach.
> Of course we can always do both.
I don't think the core needs to know. It shouldn't care when traversing
the lists what state a device is in.
> Messages vs. states: At the moment the PM core seems to be pretty
> confused over this distinction. Right in the definition of struct
> dev_pm_info we have:
>
> pm_message_t power_state;
>
> Obviously a message isn't the same thing as a state. This looks like
> something that will need to be changed in a lot of drivers when we
> introduce the new notion of a power state.
>
> As a corollary we have the problem of what to include in the argument
> passed to a suspend callback. It should be a message, clearly, and
> part of the message should be an indication of which state to go to. The
> question is, how is this state represented? For device power management
> we will want to provide a genuine power state (i.e., pointer to bus- or
> device-specific structure). For system power management we will want to
> provide a generic code -- PMSG_ON, PMSG_FREEZE, or PMSG_SUSPEND -- which
> the driver will map to a real power state.
The system state transitions should be mediated through the bus driver.
The bus should then translate it and call the driver, which will then map
it back to a device-specific state (optionally using some bus helpers).
We've talked about this before; I think David, Ben and I have all proposed
something like this. Are you suggesting something different?
For runtime states, I think the bus should be the one exporting PM control
files through sysfs, not the driver core. It will handle the display and
setting of power states, allowing it to show userspace states that
actually mean something, rather than just arbitrary numbers that don't map
to every bus.
[ To show these states, the bus should use a per-device array of states
that is filled in by the driver. This could easily include device-specific
states that the driver includes in the array. ]
Pat
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503221143460.954-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-22 23:36 ` Benjamin Herrenschmidt
@ 2005-03-23 1:17 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503221709080.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
1 sibling, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-23 1:17 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1947 bytes --]
On Tue, 22 Mar 2005, Alan Stern wrote:
> When per-device locking gets added to the driver model, this can be
> handled by making the PM core lock all devices before starting STR and
> unlock them all after waking up. Then any user process trying to resume a
> device in the middle will block until the system is fully awake.
That's not a sane design. You do not want 1 function locking and another
unlocking. That's trouble waiting to happen. We can solve that problem
today using a single semaphore to synchronize all PM calls. If a runtime
PM request comes in while a system PM request is being processed, that
process will block until the semaphore is dropped (when the system is
resumed). It will serialize all runtime PM calls, but those are not
performance-critical operations, and it shouldn't matter too much.
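A rough user-space sketch of that serialization, assuming a pthread mutex
stands in for the semaphore; pm_lock, runtime_pm_request and system_sleep
are made-up names for illustration. The point it tries to show is that
the lock is taken and dropped inside the same function, and that the
system-sleep path simply holds it until resume has finished.

```c
#include <assert.h>
#include <pthread.h>

/* One lock for every PM entry point (hypothetical name). */
static pthread_mutex_t pm_lock = PTHREAD_MUTEX_INITIALIZER;
static int pm_active;   /* 1 while some PM operation holds pm_lock */

typedef int (*pm_op_t)(void *arg);

/* Runtime PM request: take the lock, run one operation, drop the lock. */
static int runtime_pm_request(pm_op_t op, void *arg)
{
    int ret;

    pthread_mutex_lock(&pm_lock);
    assert(pm_active == 0);      /* nobody else can be inside */
    pm_active = 1;
    ret = op(arg);
    pm_active = 0;
    pthread_mutex_unlock(&pm_lock);
    return ret;
}

/* System sleep: lock and unlock happen in this one function, so the
 * lock stays held from the first suspend until resume has finished. */
static int system_sleep(pm_op_t suspend_all, pm_op_t resume_all, void *arg)
{
    int ret;

    pthread_mutex_lock(&pm_lock);
    assert(pm_active == 0);
    pm_active = 1;
    ret = suspend_all(arg);
    /* ... a real platform would sleep here if ret == 0 ... */
    resume_all(arg);
    pm_active = 0;
    pthread_mutex_unlock(&pm_lock);
    return ret;
}

/* Demo operation: adds 1 to *arg if it ran with the lock held. */
static int record_active(void *arg)
{
    *(int *)arg += pm_active;
    return 0;
}
```

A runtime request arriving from another thread during system_sleep()
would block in pthread_mutex_lock() until the system is resumed.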
> > So I think we need to have the states in some sort of order at least so
> > the core has a notion of what is "lower" and what is "higher" power to
> > deal with that. Though I suppose we could also have optional hooks in
> > drivers (pre-parent-change and post-parent-change) for drivers that want to
> > be sneaky, but that gets nasty and complicated.
>
> I agree. This is the sort of boilerplate computation that is best done in
> one single place -- the PM core. Unfortunately it means that the core has
> to understand what combinations of parent-state/child-state are legal. I
> don't know how that knowledge can best be represented.
This shouldn't go in the core. The core cannot keep track of every
parent/child power state possibility. That sounds like a nightmare. It's
up to the driver for the parent (i.e. bridge device) to know and
understand. If anything, we could provide a child_suspended() method to
the parent drivers. But, I think even that is a hack. We just need a
simple mechanism for parents to monitor the power state of their children.
Can't be that hard to come up with something clean.
Pat
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503221635130.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-23 1:21 ` Benjamin Herrenschmidt
2005-03-23 1:46 ` Patrick Mochel
2005-03-23 19:06 ` David Brownell
1 sibling, 1 reply; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-23 1:21 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 3026 bytes --]
> I think the core should always call ->suspend() for a device, regardless
> of whether it thinks it's in a low power state, or inactive. This is
> specifically for the reason that a device could be in a low-power runtime
> state when the system is suspended.
Exactly. And if we go all the way I suggested, that is define that
suspend/resume are the system-state callbacks, and that a new
enter_state low-level callback is used for lower-level state
management, then it makes even more sense since it would be those
suspend/resume calls that will add the additional semantic of "lock the
state until resume" that the "basic" state management doesn't need.
Note that rather than enter_state, I'd rather just have a function
pointer enter_this_state in the driver state array ...
Compatibility with existing drivers is easy: they don't have a state
array, so the system just assumes a default 2 state (running/suspended)
with no enter_this_state callback, and just calls the suspend/resume
callbacks for system-wide suspend. No local PM is supported by those
drivers, which is fine.
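As a sketch only, assuming a layout that nobody has agreed on yet: a
per-state enter_this_state pointer in the driver's state array, with
legacy drivers (no array) falling back to the plain suspend/resume
callbacks. The struct layouts and the helper names enter_runtime_state
and system_suspend are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct device;

/* One entry of a driver's state array (hypothetical layout). */
struct pm_state {
    const char *name;
    int (*enter_this_state)(struct device *dev);
};

struct device {
    const struct pm_state *states;  /* NULL for a legacy driver */
    size_t nr_states;
    size_t cur;                     /* legacy default: 0 = running, 1 = suspended */
    int (*suspend)(struct device *dev);
    int (*resume)(struct device *dev);
};

/* Local (runtime) PM: only drivers with a state array support it. */
static int enter_runtime_state(struct device *dev, size_t idx)
{
    int ret;

    assert(dev != NULL);
    if (!dev->states || idx >= dev->nr_states)
        return -1;
    ret = dev->states[idx].enter_this_state(dev);
    if (ret == 0)
        dev->cur = idx;
    return ret;
}

/* System-wide suspend always goes through the suspend callback. */
static int system_suspend(struct device *dev)
{
    assert(dev != NULL);
    return dev->suspend ? dev->suspend(dev) : 0;
}

/* Demo legacy driver: no state array, just suspend/resume. */
static int legacy_suspend(struct device *dev) { dev->cur = 1; return 0; }
static int legacy_resume(struct device *dev)  { dev->cur = 0; return 0; }
static struct device legacy_dev = { NULL, 0, 0, legacy_suspend, legacy_resume };
```

The legacy device refuses runtime state changes but still takes part in
system-wide suspend, which is the compatibility behaviour described
above.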
> To assist this, there should probably be only 1 list to hold the PM nodes,
> making the code a lot simpler.
I think we have to do real tree walking ...
> The system state transitions should be mediated through the bus driver.
> The bus should then translate it and call the driver, which will then map
> it back to a device-specific state (optionally using some bus helpers).
> We've talked about this before; I think David, Ben and I have all proposed
> something like this. Are you suggesting something different?
I would go to the device, not the bus, but with a different callback
than the low level enter state as explained above... In fact, if you
think about it, suspend/resume _do_ go through the bus which calls the
driver suspend/resume, so it's all fine.
> For runtime states, I think the bus should be the one exporting PM control
> files through sysfs, not the driver core. It will handle the display and
> setting of power states, allowing it to show userspace states that
> actually mean something, rather than just arbitrary numbers that don't map
> to every bus.
I don't agree. As I explained, power states will be device specific. Bus
power states won't be (they will be well defined) but most
devices/drivers are "leaf" devices and they will expose all sort of
fancy power states to userland.
I think drivers should have an array of states, with names, enter_state
function pointer (per state) and bit mask of state dependency indicating
the state dependency vs. parent. Additionally, states should be ordered
so the core knows how to properly cascade up/down.
> [ To show these states, the bus should use a per-device array of states
> that is filled in by the driver. This could easily include device-specific
> states that the driver includes in the array. ]
The bus has just a device (the bus driver, which unfortunately isn't
always clear in the current model), with well defined states for the
bus.
Ben.
* Re: Nested suspends; messages vs. states
2005-03-23 1:21 ` Benjamin Herrenschmidt
@ 2005-03-23 1:46 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503221724550.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-23 1:46 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3940 bytes --]
On Wed, 23 Mar 2005, Benjamin Herrenschmidt wrote:
>
> > I think the core should always call ->suspend() for a device, regardless
> > of whether it thinks it's in a low power state, or inactive. This is
> > specifically for the reason that a device could be in a low-power runtime
> > state when the system is suspended.
>
> Exactly. And if we go all the way I suggested, that is define that
> suspend/resume are the system-state callbacks, and that a new
> enter_state low-level callback is used for lower-level state
> management, then it makes even more sense since it would be those
> suspend/resume calls that will add the additional semantic of "lock the
> state until resume" that the "basic" state management doesn't need.
Just to clarify, are you suggesting that the functional (not necessarily
literal) steps to fully suspend a device during a system state transition
are first to 'suspend' it, preventing it from being used and accessed or
being suspended/resumed from userspace; then to have it directly enter a
low power state (e.g. via an enter_state() method)?
> Note that rather than enter_state, I'd rather just have a function
> pointer enter_this_state in the driver state array ...
Wouldn't that imply a different ->enter_state() method for each system
state?
> > To assist this, there should probably be only 1 list to hold the PM nodes,
> > making the code a lot simpler.
>
> I think we have to do real tree walking ...
Why? Note that I wasn't talking about ordering; only about consolidating
the current mess of having the active and inactive lists in the PM core.
> > For runtime states, I think the bus should be the one exporting PM control
> > files through sysfs, not the driver core. It will handle the display and
> > setting of power states, allowing it to show userspace states that
> > actually mean something, rather than just arbitrary numbers that don't map
> > to every bus.
>
> I don't agree. As I explained, power states will be device specific. Bus
> power states won't be (they will be well defined) but most
> devices/drivers are "leaf" devices and they will expose all sort of
> fancy power states to userland.
Yes. I was simply talking about who handles the userspace interaction via
sysfs. The e.g. PCI subsystem will export a sysfs file for each PCI
device.
Each device will have an array of states. I think this should be a static
sized array for the known PCI power states, and a pointer to device/driver
specific states. Each entry will have a name and a pointer to a method to
enter that state.
For devices that are not bound to a driver, this will show the standard
PCI PM states the device supports. Entering these states will be handled
by generic code, and will come with no guarantee.
When a driver is bound to a device, it will modify the static array as
necessary (removing dangerous states, and changing the method pointers),
and it will fill in the pointer for device/driver specific states.
When showing these states via sysfs, the show() method can simply access
the array for the particular state. For the setting of states, the store()
method would simply call the method pointer in the array for the requested
state.
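To make the table layout concrete, here is a small user-space sketch. It
assumes a fixed four-slot array for D0..D3 plus a pointer to
driver-specific entries; the struct names, states_show/states_store and
the "turbo" state are all invented for illustration, and the real sysfs
show()/store() signatures are deliberately not reproduced.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct pm_entry {
    const char *name;                           /* NULL: state removed */
    int (*enter)(const struct pm_entry *self);  /* method to enter it */
};

#define NR_STD_STATES 4u   /* D0..D3 */

struct dev_states {
    struct pm_entry std[NR_STD_STATES];  /* fixed-size, bus-defined part */
    const struct pm_entry *extra;        /* driver-specific states, may be NULL */
    size_t nr_extra;
};

/* Look a state up by name in both parts of the table. */
static const struct pm_entry *find_state(const struct dev_states *s,
                                         const char *name)
{
    size_t i;

    for (i = 0; i < NR_STD_STATES; i++)
        if (s->std[i].name && strcmp(s->std[i].name, name) == 0)
            return &s->std[i];
    for (i = 0; i < s->nr_extra; i++)
        if (strcmp(s->extra[i].name, name) == 0)
            return &s->extra[i];
    return NULL;
}

/* show(): list the available state names, each followed by a space.
 * Returns the number of characters written, or -1 if buf is too small. */
static int states_show(const struct dev_states *s, char *buf, size_t len)
{
    size_t i, off = 0;
    int n;

    assert(len > 0);
    buf[0] = '\0';
    for (i = 0; i < NR_STD_STATES + s->nr_extra; i++) {
        const struct pm_entry *e = i < NR_STD_STATES
            ? &s->std[i] : &s->extra[i - NR_STD_STATES];

        if (!e->name)
            continue;             /* the driver removed this state */
        n = snprintf(buf + off, len - off, "%s ", e->name);
        if (n < 0 || (size_t)n >= len - off)
            return -1;
        off += (size_t)n;
    }
    return (int)off;
}

/* store(): call the method pointer for the requested state. */
static int states_store(const struct dev_states *s, const char *name)
{
    const struct pm_entry *e = find_state(s, name);

    return e && e->enter ? e->enter(e) : -1;
}

/* Demo data: a driver that removed D1/D2 and added one state of its own. */
static const char *last_entered;
static int demo_enter(const struct pm_entry *self)
{
    last_entered = self->name;
    return 0;
}
static const struct pm_entry demo_extra[] = { { "turbo", demo_enter } };
static const struct dev_states demo = {
    { { "D0", demo_enter }, { NULL, NULL }, { NULL, NULL }, { "D3", demo_enter } },
    demo_extra, 1
};
```

The string handling lives in one place, and a driver only supplies names
and method pointers.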
I *think* this is similar/identical to what you've suggested before,
right?
Note that I say that the bus code handles the show()/store() methods
because it's string parsing and formatting that would be replicated
through a lot of drivers. It's not that hard, but it would all be
copy/replaced code that can be hard to deal with when bugs arise. And, it
can simply enough be done by the bus subsystem.
> I think drivers should have an array of states, with names, enter_state
> function pointer (per state) and bit mask of state dependency indicating
> the state dependency vs. parent. Additionally, states should be ordered
> so the core knows how to properly cascade up/down.
I'm not sure I get this. Could you elaborate more?
Thanks,
Pat
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503221724550.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-23 3:31 ` Benjamin Herrenschmidt
2005-03-23 18:20 ` Patrick Mochel
` (3 more replies)
0 siblings, 4 replies; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-23 3:31 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 7981 bytes --]
> Just to clarify, are you suggesting that the functional (not necessarily
> literal) steps to fully suspend a device during a system state transition
> are first to 'suspend' it, preventing it from being used and accessed or
> being suspended/resumed from userspace; then to have it directly enter a
> low power state (e.g. via an enter_state() method)?
At the system level, low power state must be entered atomically with the
actual suspending of the driver. That is, when suspend() has returned,
the device must be sleeping. That is necessary for the good old reason
that once the parent is asleep, no way you can talk to your device.
> > Note that rather than enter_state, I'd rather just have a function
> > pointer enter_this_state in the driver state array ...
>
> Wouldn't that imply a different ->enter_state() method for each system
> state?
one enter state method for each driver state. If the driver has one
enter state for each system state, then go for it.
> > > To assist this, there should probably be only 1 list to hold the PM nodes,
> > > making the code a lot simpler.
> >
> > I think we have to do real tree walking ...
>
> Why? Note that I wasn't talking about ordering; only about consolidating
> the current mess of having the active and inactive lists in the PM core.
Because of parent/child dependencies, I'm not sure we can get away with
simple list walking.
> > > For runtime states, I think the bus should be the one exporting PM control
> > > files through sysfs, not the driver core. It will handle the display and
> > > setting of power states, allowing it to show userspace states that
> > > actually mean something, rather than just arbitrary numbers that don't map
> > > to every bus.
> >
> > I don't agree. As I explained, power states will be device specific. Bus
> > power states won't be (they will be well defined) but most
> > devices/drivers are "leaf" devices and they will expose all sort of
> > fancy power states to userland.
>
> Yes. I was simply talking about who handles the userspace interaction via
> sysfs. The e.g. PCI subsystem will export a sysfs file for each PCI
> device.
>
> Each device will have an array of states. I think this should be a static
> sized array for the known PCI power states, and a pointer to device/driver
> specific states. Each entry will have a name and a pointer to a method to
> enter that state.
I don't understand your static sized + pointer idea ...
At every level, we have devices. The core doesn't make a difference
between a device and a bus state. A bus state is just a device parent's
state. The only difference is that _we_ will strictly define the bus
states for known busses, while leaf devices will have freedom of doing
what they want.
So at every level, device -> state array. No need for fancy pointers.
> For devices that are not bound to a driver, this will show the standard
> PCI PM states the device supports. Entering these states will be handled
> by generic code, and will come with no guarantee.
Hrm... no driver -> no suspend imho... but that can be debated :)
> When a driver is bound to a device, it will modify the static array as
> necessary (removing dangerous states, and changing the method pointers),
> and it will fill in the pointer for device/driver specific states.
The driver just exposes its array that gets attached to struct device.
No need to "modify" an existing array or whatever. In 99.9% of the
cases, the array will just be a static structure in the driver.
> When showing these states via sysfs, the show() method can simply access
> the array for the particular state. For the setting of states, the store()
> method would simply call the method pointer in the array for the requested
> state.
Yes and no. If we deal with parent/child dependencies, we'll have to be
smarter than that. Again, we need to be able to do fancy things like
putting a USB bus into "suspend" (i.e. USB standard suspend state, not to
be confused with "system" suspend, though the policy for system suspend is
probably, at the USB bus level, to enter suspend anyway). For that, the
bus driver will have to make sure all child devices are in a state
compatible with the parent being suspended. This is why I want this
state dependency mechanism.
We can't have the USB bus driver know all possible states of its children,
that's contrary to the whole idea of letting leaf devices have any state
they want. _However_, since child devices do know the parent state (the
USB bus states have been clearly defined), they can have in their state
array, a dependency indication indicating that they have to be put into
state Y when the parent goes into state X or deeper (I want states to be
ordered to make things easier).
That sort of thing requires some smarter tree walking though, and maybe
a bit of recursion, though we'll be careful here :)
Either that, or we can have a pair of routines in children called
"parent_state_change_before" and "parent_change_state_after" to let them
deal with it locally, but that looks ugly...
Again, most devices will have a simple array, so it will end up being
extremely simple for driver developers to implement. Granted, it makes
bus iteration for us more complicated and pushes some complexity to the
core. But, as I explained previously:
- This is a complex problem
- I'd rather have the complexity in a single place (the core) and keep
the driver side as simple as possible
I think we would fix a lot of our problems if we had a notion of a bus
tree iterator. When the PM code needs to iterate the tree, it creates
the iterator object which registers itself somewhere.
When/if a device gets removed/added to the tree, the iterator can be
"updated" and put back into position to deal with it.
In fact, it's all simple to deal with until we have to deal with races
between tree walking vs. suspend resume. I think that can be easily
solved using the above, plus always delaying (put them in a "todo list")
all additions while a system suspend is in progress.
There are additional things to think about though (regardless of the scheme
we decide to use). One of them is: a device may be added while the bus
isn't in "max" state. The other is child->parent callbacks. The device
driver itself may want to kick the bus into life, especially in case of
idle suspend.
The whole "kick the bus, I want more power" (that is child -> parent
action) I think should stay bus specific. That is, the child knows the
bus isn't in a required state to process a given request. It will call a
parent routine to "kick" the bus. The bus will then eventually change
state and the child will proceed. (Maybe we want something asynchronous
there btw... to be discussed).
The addition of a device while the bus isn't in "max" state means the
driver needs a mechanism to know in which state it's starting up. That can
be added too.
> I *think* this is similar/identical to what you've suggested before,
> right?
>
> Note that I say that the bus code handles the show()/store() methods
> because it's string parsing and formatting that would be replicated
> through a lot of drivers. It's not that hard, but it would all be
> copy/replaced code that can be hard to deal with when bugs arise. And, it
> can simply enough be done by the bus subsystem.
>
> > I think drivers should have an array of states, with names, enter_state
> > function pointer (per state) and bit mask of state dependency indicating
> > the state dependency vs. parent. Additionally, states should be ordered
> > so the core knows how to properly cascade up/down.
>
> I'm not sure I get this. Could you elaborate more?
Well, states have numbers implicitly (their position in the array). For
well-defined state arrays (that is busses), those numbers can be made
into known constants.
The child can then have an integer in each state indicating the minimum
state required for the bus for the device state to be valid. That will
deal with the "minimum dependency".
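A small sketch of that "minimum dependency" lookup. It assumes both bus
and child states are ordered so that a larger index means deeper (lower
power), and it reads "minimum state required for the bus" as "the deepest
bus state in which this child state is still valid"; the names
child_state, deepest_bus and child_state_for_bus are hypothetical.

```c
#include <assert.h>

/* Bus states, ordered: a larger number means deeper (lower power). */
enum { BUS_ON = 0, BUS_SUSPEND = 1 };

struct child_state {
    const char *name;
    int deepest_bus;   /* deepest bus state in which this state is valid */
};

/* Child states are ordered the same way. Starting from the current
 * state, find the first state that remains valid once the bus is in
 * bus_target. Returns the state index, or -1 if none qualifies. */
static int child_state_for_bus(const struct child_state *tbl, int n,
                               int cur, int bus_target)
{
    int i;

    assert(cur >= 0 && cur < n);
    for (i = cur; i < n; i++)
        if (tbl[i].deepest_bus >= bus_target)
            return i;
    return -1;
}

/* Demo table for a leaf device on the two-state bus above. */
static const struct child_state demo_child[] = {
    { "active",    BUS_ON },
    { "idle",      BUS_ON },
    { "suspended", BUS_SUSPEND },
};
```

Before the bus enters BUS_SUSPEND, an "active" child would be moved to
index 2 ("suspended"); a child that is already deep enough is left alone.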
Ben.
* Re: Nested suspends; messages vs. states
2005-03-23 3:31 ` Benjamin Herrenschmidt
@ 2005-03-23 18:20 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231008340.17099-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 1:27 ` Patrick Mochel
` (2 subsequent siblings)
3 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-23 18:20 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1638 bytes --]
On Wed, 23 Mar 2005, Benjamin Herrenschmidt wrote:
>
> > Just to clarify, are you suggesting that the functional (not necessarily
> > literal) steps to fully suspend a device during a system state transition
> > are first to 'suspend' it, preventing it from being used and accessed or
> > being suspended/resumed from userspace; then to have it directly enter a
> > low power state (e.g. via an enter_state() method)?
>
> At the system level, low power state must be entered atomically with the
> actual suspending of the driver. That is, when suspend() has returned,
> the device must be sleeping. That is necessary for the good old reason
> that once the parent is asleep, no way you can talk to your device.
I wasn't disputing that. What I was trying to say was the core could
effectively do:
for each device {
device->suspend(); /* Stop Device */
device->enter_state(); /* Power Down */
}
The reason I'm going down this road is because I think it could possibly
be split up like this:
device->class->stop(dev, system_state);
device->bus->save_state(dev, system_state);
device->bus->enter_state(dev, system_state);
The first would perform functional-level suspension - stopping current
transactions and preventing future ones.
The second would call down to the device driver and save the device
context for the system state being entered.
The third would call down to the device driver and actually enter the low
power state.
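Spelled out as compilable C, the three-step split might look like the
sketch below. The ops structures and suspend_device() are hypothetical,
"class" is spelled cls, and the trace buffer exists only so the demo can
record the call order.

```c
#include <assert.h>
#include <stddef.h>

struct device;

struct class_ops {
    int (*stop)(struct device *dev, int system_state);
};

struct bus_ops {
    int (*save_state)(struct device *dev, int system_state);
    int (*enter_state)(struct device *dev, int system_state);
};

struct device {
    const struct class_ops *cls;
    const struct bus_ops *bus;
    char trace[4];     /* demo only: records the call order */
    size_t ntrace;
};

/* Stop the device, save its context, then power it down. Each step
 * runs only if the previous one succeeded. */
static int suspend_device(struct device *dev, int system_state)
{
    int ret;

    ret = dev->cls->stop(dev, system_state);
    if (ret)
        return ret;
    ret = dev->bus->save_state(dev, system_state);
    if (ret)
        return ret;
    return dev->bus->enter_state(dev, system_state);
}

/* Demo callbacks that only note that they were called. */
static int note(struct device *dev, char c)
{
    assert(dev->ntrace < sizeof dev->trace);
    dev->trace[dev->ntrace++] = c;
    return 0;
}
static int demo_stop(struct device *d, int s)  { (void)s; return note(d, 's'); }
static int demo_save(struct device *d, int s)  { (void)s; return note(d, 'v'); }
static int demo_enter(struct device *d, int s) { (void)s; return note(d, 'e'); }

static const struct class_ops demo_class = { demo_stop };
static const struct bus_ops demo_bus = { demo_save, demo_enter };
static struct device demo_dev = { &demo_class, &demo_bus, { 0 }, 0 };
```

The sketch does not undo a successful stop() when a later step fails; a
real split would have to decide who restarts the device in that case.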
Functionally, this is what happens. Do you see a reason to/not to break it
up programmatically?
...
Too many subjects in 1 email. Snipping and replying separately. :)
Pat
* Re: Nested suspends; messages vs. states
[not found] ` <20050321222609.GK1390-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-22 3:08 ` Alan Stern
@ 2005-03-23 18:32 ` David Brownell
[not found] ` <200503231032.36164.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
1 sibling, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-23 18:32 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 1663 bytes --]
On Monday 21 March 2005 2:26 pm, Pavel Machek wrote:
> On Po 21-03-05 16:14:09, Alan Stern wrote:
> > On Mon, 21 Mar 2005, Pavel Machek wrote:
> >
> > > > pm_message_t power_state;
> > > >
> > > > Obviously a message isn't the same thing as a state. This looks like
> > > > something that will need to be changed in a lot of drivers when we
> > > > introduce the new notion of a power state.
> > >
> > > This is not so obvious to me.
It's a semantics thing, among others. "System transitioning to
state Z" is a message, as is "system leaving state Z". Both are
distinct from the "system state Z" being described.
Just like the system power state is distinct from the power state
of each device in the system.
> > > Message seems to represent the state
> > > driver is in quite well... Plus we need PMSG_ON for state the device
> > > gets after resume, but that's quite easy...
> >
> > So what value for power_state do you use to tell apart PCI D1 from D2?
>
> You just ask via pci_choose_state(), and it tells
> you... pci_choose_state() should be deterministic.
Gaack, that again? No. Different devices support different
states, so it's got to depend on at least the device hardware
and the target system state. Not "deterministic" as in the
current version. I'm going to have to dig up the original
code, it didn't have bugs like that.
Also driver capabilities matter, since they don't all know how
to cope with the partial device reset that may be associated with
transition from D3 (but not D2 or D1) ... but at least in that
case the driver can ensure it doesn't choose PCI_D3 unless it
can actually cope with that result.
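For illustration only, a chooser that takes those inputs into account
could look like this. It is not the kernel's pci_choose_state(); the
function choose_state, the dev_caps structure and the idea of passing in
the deepest state the target system state allows are all assumptions made
for the sketch.

```c
#include <assert.h>

enum dstate { D0, D1, D2, D3 };

struct dev_caps {
    unsigned int hw_states;   /* bit n set: the hardware supports Dn */
    int copes_with_d3_reset;  /* driver handles the partial reset after D3 */
};

/* Pick the deepest state, no deeper than deepest_allowed (derived from
 * the target system state), that the hardware supports and the driver
 * can cope with. Falls back to D0. */
static enum dstate choose_state(const struct dev_caps *caps,
                                enum dstate deepest_allowed)
{
    int s;

    assert(caps->hw_states & (1u << D0));   /* D0 is always supported */
    for (s = deepest_allowed; s > D0; s--) {
        if (!(caps->hw_states & (1u << s)))
            continue;                        /* hardware lacks this state */
        if (s == D3 && !caps->copes_with_d3_reset)
            continue;                        /* driver cannot resume from D3 */
        return (enum dstate)s;
    }
    return D0;
}
```

Two devices asked to reach the same system state can therefore end up in
different device states, which is the whole point.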
- Dave
* Re: Nested suspends; messages vs. states
2005-03-22 4:21 ` Benjamin Herrenschmidt
2005-03-22 17:04 ` Alan Stern
@ 2005-03-23 18:58 ` David Brownell
[not found] ` <200503231058.54311.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
1 sibling, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-23 18:58 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 4179 bytes --]
On Monday 21 March 2005 8:21 pm, Benjamin Herrenschmidt wrote:
>
> Yup, the model we are designing now should allow for arbitrary
> transitions I suppose as long as the target state is legal vs. the
> dependencies.
Leading to the question: how are the dependencies identified and
enforced?
My current thought is that it's wrong to expect the PM core code to
do all this. Discussion on this list strongly suggests to me that
we'll never achieve a Grand Unified Theory of PM into which every
device, bus, platform, and board will fit ... unless we give up
on the notion that _everything_ be centrally controlled. The key
will be having good ways to delegate all the important issues.
So for example it would make sense to have device suspend() logic
verify any dependencies for the target state, and then just fail
cleanly if they're not satisfied.
That'll mostly be an issue for bridge drivers ... like PCI
bridges, host adapters for things like USB, hubs, and so on.
But it also shows a way to handle custom hardware, which may
not be as regular as the PC and server centric developers want
the world to be...
> > The simplest way of handling this is to allow explicitly for such
> > possibilities. When a device is asked to go from a very-low-power state
> > to a slightly-low-power state, it should be legal for the driver to leave
> > it in the very-low-power state.
>
> Well... I'm not sure about that one. If the power states represent some
> performance states, the system may want to raise the performance a bit
> at the expense of power and would stay low perf unless a full
> transition to state "full on" is done ?
>
> I suspect it's a per driver responsibility here. I suppose common sense
> will dictate what can be allowed and what not.
I'm happy with per-driver responsibilities. Less so with the notion
of conflating power states and performance modes; they seem a bit more
orthogonal to me.
On the other hand, I've also described the role of a driver suspend()
call as just picking one of potentially many device power states that
are compatible with the target (system) state, and I think there's
common ground there. If the "very low" device power state is compatible,
nobody should care ... because the request to the driver should have been
"become compatible with this system power state", and only in unusual
cases (sysfs requests) "go into this device power state".
So, two types of request to drivers then. The main one would be to
become compatible with a given system power state; flexible. The
inflexible one would be to go into a specific device power state.
> One thing that is important if we deal with partial suspend and tree
> dependencies is the ordering...
>
> When a device is asked to enter a given state, the dependencies of the
> children have to be checked in a different order if we are going to lower
> power than if we are going to higher power.
>
> If going to lower power (that is toward suspend), we must check the
> dependencies of children and eventually low-power them before the parent
> is actually state changed.
>
> If going to higher power, it is the opposite.
The handful of drivers that deal with dependencies can be responsible
for that ordering. They should be able to delegate much of the work
to PM core code (else why have a PM core?).
> So I think we need to have the states in some sort of order at least so
> the core has a notion of what is "lower" and what is "higher" power to
> deal with that. Though I suppose we could also have optional hooks in
> drivers (pre-parent-change and post-parent-change) for drivers that want to
> be sneaky, but that gets nasty and complicated.
I suspect that having all device states in some sort of order like that
is a problem isomorphic with the Grand Unified Theory of PM, which I
said we shouldn't try to derive.
If the drivers that deal with dependencies -- "bridge" drivers, for
now -- can tell the PM core what to do with a given device (before
me, after me, skip it) the PM core could just ask those bridge drivers
to build a list, then walk the list issuing the right calls. Instead
of having a single global list, build it on demand.
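As a rough sketch of that build-on-demand idea (all names hypothetical, and the default policy of "children first when lowering power, parent first when raising it" is an assumption taken from the ordering rule quoted above), a bridge could answer before-me/after-me/skip per child and the core could flatten the answers into a list:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; not an existing kernel interface. */
enum toy_order { TOY_BEFORE_ME, TOY_AFTER_ME, TOY_SKIP };

struct toy_node {
	const char *name;
	struct toy_node **children;
	size_t nchildren;
	/* Optional bridge hook; NULL means "use the default ordering". */
	enum toy_order (*place_child)(const struct toy_node *bridge,
				      const struct toy_node *child,
				      int lowering);
};

static enum toy_order toy_placement(struct toy_node *n, struct toy_node *c,
				    int lowering)
{
	if (n->place_child)
		return n->place_child(n, c, lowering);
	/* Default: children first when going down, parent first when going up. */
	return lowering ? TOY_BEFORE_ME : TOY_AFTER_ME;
}

/* Appends the subtree rooted at n to out[] in call order; returns new length. */
size_t toy_build_list(struct toy_node *n, int lowering,
		      struct toy_node **out, size_t cap, size_t len)
{
	size_t i;

	for (i = 0; i < n->nchildren; i++)
		if (toy_placement(n, n->children[i], lowering) == TOY_BEFORE_ME)
			len = toy_build_list(n->children[i], lowering,
					     out, cap, len);

	assert(len < cap);
	out[len++] = n;

	for (i = 0; i < n->nchildren; i++)
		if (toy_placement(n, n->children[i], lowering) == TOY_AFTER_ME)
			len = toy_build_list(n->children[i], lowering,
					     out, cap, len);

	/* Children answered with TOY_SKIP never make it onto the list. */
	return len;
}
```

The core would then just walk out[] and issue the calls; nothing here needs a global ordering of device states.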
- Dave
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503221709080.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-23 19:02 ` David Brownell
[not found] ` <200503231102.27137.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 21:08 ` Alan Stern
1 sibling, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-23 19:02 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 443 bytes --]
On Tuesday 22 March 2005 5:17 pm, Patrick Mochel wrote:
>
> We just need a
> simple mechanism for parents to monitor the power state of their children.
> Can't be that hard to come up with something clean.
Minimally they can walk the list of their children at appropriate moments.
I think folk have also raised the notion of having a "child state changed"
call to the parent; it's not clear to me that needs to involve the PM core.
- Dave
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503221635130.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-23 1:21 ` Benjamin Herrenschmidt
@ 2005-03-23 19:06 ` David Brownell
[not found] ` <200503231106.03160.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
1 sibling, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-23 19:06 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 585 bytes --]
On Tuesday 22 March 2005 4:52 pm, Patrick Mochel wrote:
> >
> I think the core should always call ->suspend() for a device, regardless
> of whether it thinks it's in a low power state, or inactive. This is
> specifically for the reason that a device could be in a low-power runtime
> state when the system is suspended.
I don't quite see a need for this. If the parent/bridge driver knows
the device is adequately suspended, why kick it again? It's actually
rather annoying -- and error/bug prone! -- to have to code drivers to
detect and cope with superfluous suspend calls.
- Dave
* Re: Nested suspends; messages vs. states
[not found] ` <200503231058.54311.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-23 19:37 ` Jordan Crouse
[not found] ` <20050323123725.201d8a67-aftB2sG12IhaqnLngUycEA@public.gmane.org>
2005-03-23 23:24 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 72+ messages in thread
From: Jordan Crouse @ 2005-03-23 19:37 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 1340 bytes --]
On Wed, 23 Mar 2005 10:58:54 -0800
"David Brownell" <david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org> wrote:
> So, two types of request to drivers then. The main one would be to
> become compatible with a given system power state; flexible. The
> inflexible one would be to go into a specific device power state.
I like the idea of the flexible request. This would add the ability for
the policy to individually manage devices that for one reason or another
cannot or should not enter a power state compatible with the system
power state.
As an illustrative example, I'm thinking of a fictitious VoIP phone with
an audio device that has an abnormally large latency resuming from a
clocks off power state, so by the time it wakes up and is ready to go,
the incoming call is lost. At this point, the user/developer/designer
could decide that the extra power consumed by leaving the clocks on
is less important than having a responsive device, and the policy is set
so that the device only enters a D1 state with a suspend-to-ram system
power state, rather than a D3 state as it normally would (pardon the
PCI/ACPI terms, they're just for simplicity).
In that case, even though the device state wouldn't technically be
compatible with the given system state, it would still be the
best fit for the platform as a whole.
Jordan
* Re: Nested suspends; messages vs. states
[not found] ` <200503231106.03160.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-23 20:29 ` Nigel Cunningham
[not found] ` <1111609769.14853.104.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
2005-03-24 2:13 ` Patrick Mochel
1 sibling, 1 reply; 72+ messages in thread
From: Nigel Cunningham @ 2005-03-23 20:29 UTC (permalink / raw)
To: David Brownell; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1167 bytes --]
Hi
On Thu, 2005-03-24 at 06:06, David Brownell wrote:
> On Tuesday 22 March 2005 4:52 pm, Patrick Mochel wrote:
> > >
> > I think the core should always call ->suspend() for a device, regardless
> > of whether it thinks it's in a low power state, or inactive. This is
> > specifically for the reason that a device could be in a low-power runtime
> > state when the system is suspended.
>
> I don't quite see a need for this. If the parent/bridge driver knows
> the device is adequately suspended, why kick it again? It's actually
> rather annoying -- and error/bug prone! -- to have to code drivers to
> detect and cope with superfluous suspend calls.
I would think it would be simpler not to have the parent/bridge check
whether the device is already suspended, and to just have all the logic
that decides what to do in the driver. I'm only thinking intuitively,
but it sounds like that should help with avoiding race conditions.
Regards,
Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574
Maintainer of Suspend2 Kernel Patches http://suspend2.net
* Re: Nested suspends; messages vs. states
[not found] ` <200503231102.27137.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-23 20:36 ` Nigel Cunningham
0 siblings, 0 replies; 72+ messages in thread
From: Nigel Cunningham @ 2005-03-23 20:36 UTC (permalink / raw)
To: David Brownell; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1056 bytes --]
Hi.
On Thu, 2005-03-24 at 06:02, David Brownell wrote:
> On Tuesday 22 March 2005 5:17 pm, Patrick Mochel wrote:
> >
> > We just need a
> > simple mechanism for parents to monitor the power state of their children.
> > Can't be that hard to come up with something clean.
>
> Minimally they can walk the list of their children at appropriate moments.
> I think folk have also raised the notion of having a "child state changed"
> call to the parent; it's not clear to me that needs to involve the PM core.
Yes. The children should notify the parent, not the parent poll the
children. Polling is ugly :>
On top of this, the evaluation of the state of the children should be
atomic. Thus, their notifications of idleness might simply atomically
increment/decrement a counter of the number of children busy, for
example.
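A small userspace sketch of such a counter, using C11 atomics; the struct and function names are made up for illustration. In the kernel the natural equivalent would be an atomic_t with atomic_inc() and atomic_dec_and_test(). Note that the counter by itself only makes the bookkeeping atomic; the parent would still need some way to stop a child from becoming busy between the check and the actual power-down.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical parent-side bookkeeping: how many children are busy. */
struct toy_parent {
	atomic_int busy_children;
};

/* A child calls this when it starts doing work. */
void toy_child_busy(struct toy_parent *p)
{
	atomic_fetch_add(&p->busy_children, 1);
}

/*
 * A child calls this when it goes idle. Returns true if it was the last
 * busy child, i.e. the parent may now consider powering itself down.
 */
bool toy_child_idle(struct toy_parent *p)
{
	int prev = atomic_fetch_sub(&p->busy_children, 1);

	assert(prev > 0);	/* an idle notification without a busy one is a bug */
	return prev == 1;
}

/* Snapshot check; only advisory unless new activity is also held off. */
bool toy_parent_may_suspend(struct toy_parent *p)
{
	return atomic_load(&p->busy_children) == 0;
}
```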
Regards.
Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574
Maintainer of Suspend2 Kernel Patches http://suspend2.net
* Re: Nested suspends; messages vs. states
[not found] ` <1111609769.14853.104.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
@ 2005-03-23 20:55 ` David Brownell
2005-03-23 21:18 ` Alan Stern
1 sibling, 0 replies; 72+ messages in thread
From: David Brownell @ 2005-03-23 20:55 UTC (permalink / raw)
To: ncunningham-3EexvZdKGZRWk0Htik3J/w; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1432 bytes --]
On Wednesday 23 March 2005 12:29 pm, Nigel Cunningham wrote:
> Hi
>
> On Thu, 2005-03-24 at 06:06, David Brownell wrote:
> > On Tuesday 22 March 2005 4:52 pm, Patrick Mochel wrote:
> > > >
> > > I think the core should always call ->suspend() for a device, regardless
> > > of whether it thinks it's in a low power state, or inactive. This is
> > > specifically for the reason that a device could be in a low-power runtime
> > > state when the system is suspended.
> >
> > I don't quite see a need for this. If the parent/bridge driver knows
> > the device is adequately suspended, why kick it again? It's actually
> > rather annoying -- and error/bug prone! -- to have to code drivers to
> > detect and cope with superfluous suspend calls.
>
> I would think it would be simpler not to have the parent/bridge check
> whether the device is already suspended, and to just have all the logic
> that decides what to do in the driver. I'm only thinking intuitively,
> but it sounds like that should help with avoiding race conditions.
Maybe the solution should involve a parent/child handshake; that'd
be the place to address potential races, and it could include the
notification folk seem to want.
You're suggesting a tradeoff which shifts work from the bridges,
which won't be numerous, to the drivers, which will be. I still
don't see a need to do that, and I'd rather take opportunities to
make the drivers simpler.
- Dave
* Re: Nested suspends; messages vs. states
[not found] ` <200503231032.36164.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-23 21:00 ` Pavel Machek
0 siblings, 0 replies; 72+ messages in thread
From: Pavel Machek @ 2005-03-23 21:00 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 714 bytes --]
Hi!
> > > > > pm_message_t power_state;
> > > > >
> > > > > Obviously a message isn't the same thing as a state. This looks like
> > > > > something that will need to be changed in a lot of drivers when we
> > > > > introduce the new notion of a power state.
> > > >
> > > > This is not so obvious to me.
>
> It's a semantics thing, among others. "System transitioning to
> state Z" is a message, as is "system leaving state Z". Both are
> distinct from the "system state Z" being described.
But that does not mean you can't use same type for both.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231008340.17099-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-23 21:02 ` Pavel Machek
[not found] ` <20050323210204.GE30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-23 23:14 ` Nested suspends; messages vs. states Benjamin Herrenschmidt
1 sibling, 1 reply; 72+ messages in thread
From: Pavel Machek @ 2005-03-23 21:02 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1654 bytes --]
Hi!
> > At the system level, low power state must be entered atomically with the
> > actual suspending of the driver. That is, when suspend() has returned,
> > the device must be sleeping. That is necessary for the good old reason
> > that once the parent is asleep, no way you can talk to your device.
>
> I wasn't disputing that. What I was trying to say was the core could
> effectively do:
>
> for each device {
> 	device->suspend();	/* Stop Device */
> 	device->enter_state();	/* Power Down */
> }
>
> The reason I'm going down this road is because I think it could possibly
> be split into such a way like this:
>
> device->class->stop(dev, system_state);
> device->bus->save_state(dev, system_state);
> device->bus->enter_state(dev, system_state);
>
>
> The first would perform functional-level suspension - stopping current
> transactions and preventing future ones.
>
> The second would call down to the device driver and save the device
> context for the system state being entered.
>
> The third would call down to the device driver and actually enter the low
> power state.
>
> Functionally, this is what happens. Do you see a reason to/not to break it
> up programmatically?
Yes.
There are many drivers that do not fit your idea of "driver". Like
mtrrs. Some drivers only ever do work on resume, etc. Forcing a driver
to think about how to split itself into class->stop, bus->save_state and
bus->enter_state is a bad idea. [Notice that almost no drivers need a
->save_state operation...]
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503221709080.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-23 19:02 ` David Brownell
@ 2005-03-23 21:08 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503231544550.631-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
1 sibling, 1 reply; 72+ messages in thread
From: Alan Stern @ 2005-03-23 21:08 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2109 bytes --]
On Tue, 22 Mar 2005, Patrick Mochel wrote:
> On Tue, 22 Mar 2005, Alan Stern wrote:
>
> > When per-device locking gets added to the driver model, this can be
> > handled by making the PM core lock all devices before starting STR and
> > unlock them all after waking up. Then any user process trying to resume a
> > device in the middle will block until the system is fully awake.
>
> That's not a sane design. You do not want 1 function locking and another
> unlocking. That's trouble waiting to happen. We can solve that problem
> today using a single semaphore to synchronize all PM calls. If a runtime
> PM request comes in while a system PM request is being processed, that
> process will block until the semaphore is dropped (when the system is
> resumed). It will serialize all runtime PM calls, but those are not
> performance-critical operations, and it shouldn't matter too much.
1 function locking and another unlocking? Not being aware of all the
details of the system-power-management code, I kind of assumed STR looked
something like this:
void suspend_to_ram()
{
	suspend all devices
	turn off main power
	/* Zzzz... */
	resume all devices
}
So what's wrong with changing it into:
void suspend_to_ram()
{
	lock all devices
	suspend all devices
	turn off main power
	/* Zzzz... */
	resume all devices
	unlock all devices
}
I agree that your single-semaphore idea would also work to prevent
unwanted resumes during a system sleep. But it doesn't protect against
the possibility of devices being removed or added while the PM core is
traversing its lists. Locking does protect against that -- or rather, it
will once the mechanisms have been added to the driver model core.
Now maybe the list traversal doesn't need protection. Something like
"Always suspend next the most recently added device that hasn't already
been suspended" would probably work okay. But given that during a system
sleep transition the PM core is supposed to be in control of all devices
for the entire time they are suspended, doesn't it make sense to have it
lock them?
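To make the shape concrete, here is a single-threaded toy model of the locked version. Everything in it is hypothetical: the per-device "lock" is just a flag, and a runtime resume attempted while the flag is set gets -EBUSY instead of sleeping, which is only a stand-in for the blocking behaviour a real per-device lock would give. The while_asleep hook exists purely so the effect can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device: "locked" stands in for a real per-device lock. */
struct toy_dev {
	bool locked;
	bool suspended;
};

/* Runtime-PM style resume request from outside the PM core. */
int toy_runtime_resume(struct toy_dev *d)
{
	if (d->locked)
		return -EBUSY;	/* real code would block here until unlocked */
	d->suspended = false;
	return 0;
}

/* Helper for demonstrating an intruding resume request mid-sleep. */
struct toy_poke {
	struct toy_dev *dev;
	int result;
};

void toy_poke_cb(void *arg)
{
	struct toy_poke *p = arg;

	p->result = toy_runtime_resume(p->dev);
}

/* Lock and unlock happen in the same function, around the whole sleep. */
void toy_suspend_to_ram(struct toy_dev *devs, size_t n,
			void (*while_asleep)(void *), void *arg)
{
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++)		/* lock all devices */
		devs[i].locked = true;
	for (i = n; i-- > 0; )		/* suspend, most recently added first */
		devs[i].suspended = true;

	if (while_asleep)		/* Zzzz... */
		while_asleep(arg);

	for (i = 0; i < n; i++)		/* resume all devices */
		devs[i].suspended = false;
	for (i = n; i-- > 0; )		/* unlock all devices */
		devs[i].locked = false;
}
```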
Alan Stern
* Re: Nested suspends; messages vs. states
[not found] ` <1111609769.14853.104.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
2005-03-23 20:55 ` David Brownell
@ 2005-03-23 21:18 ` Alan Stern
1 sibling, 0 replies; 72+ messages in thread
From: Alan Stern @ 2005-03-23 21:18 UTC (permalink / raw)
To: Nigel Cunningham; +Cc: David Brownell, Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1265 bytes --]
On Thu, 24 Mar 2005, Nigel Cunningham wrote:
> Hi
>
> On Thu, 2005-03-24 at 06:06, David Brownell wrote:
> > On Tuesday 22 March 2005 4:52 pm, Patrick Mochel wrote:
> > > >
> > > I think the core should always call ->suspend() for a device, regardless
> > > of whether it thinks it's in a low power state, or inactive. This is
> > > specifically for the reason that a device could be in a low-power runtime
> > > state when the system is suspended.
> >
> > I don't quite see a need for this. If the parent/bridge driver knows
> > the device is adequately suspended, why kick it again? It's actually
> > rather annoying -- and error/bug prone! -- to have to code drivers to
> > detect and cope with superfluous suspend calls.
>
> I would think it would be simpler not to have the parent/bridge check
> whether the device is already suspended, and to just have all the logic
> that decides what to do in the driver. I'm only thinking intuitively,
> but it sounds like that should help with avoiding race conditions.
David and Patrick are talking about two different things without realizing
it. Patrick said the _core_ should always call bus->suspend. Dave said
the _parent/bridge/bus_ doesn't have to call driver->suspend. Both can be
correct.
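A toy illustration of how both can hold at once, with hypothetical names throughout: the core unconditionally calls the bus-level suspend for every device, and the bus-level routine is the one that decides whether the driver needs to be called at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dev {
	bool suspended;			/* tracked by the bus layer */
	int driver_suspend_calls;	/* for observing what the driver saw */
};

/* Driver-level suspend: never sees a superfluous call in this scheme. */
static int toy_driver_suspend(struct toy_dev *d)
{
	d->driver_suspend_calls++;
	d->suspended = true;
	return 0;
}

/* Bus-level suspend: filters out devices that are already suspended. */
int toy_bus_suspend(struct toy_dev *d)
{
	if (d->suspended)
		return 0;	/* already low power; report success */
	return toy_driver_suspend(d);
}

/* Core: always calls the bus-level method, knowing nothing about state. */
int toy_core_suspend_all(struct toy_dev **devs, size_t n)
{
	size_t i;
	int err;

	assert(devs != NULL || n == 0);

	for (i = n; i-- > 0; ) {
		err = toy_bus_suspend(devs[i]);
		if (err)
			return err;
	}
	return 0;
}
```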
Alan Stern
* Re: Nested suspends; messages vs. states
[not found] ` <20050323210204.GE30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-23 21:35 ` Nigel Cunningham
[not found] ` <1111613750.14853.117.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Nigel Cunningham @ 2005-03-23 21:35 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 818 bytes --]
Hi.
On Thu, 2005-03-24 at 08:02, Pavel Machek wrote:
> Yes.
>
> There are many drivers that do not fit your idea of "driver". Like
> mtrrs. Some drivers only ever do work on resume, etc. Forcing a driver
> to think about how to split itself into class->stop, bus->save_state and
> bus->enter_state is a bad idea. [Notice that almost no drivers need a
> ->save_state operation...]
I don't think MTRRs should be counted as drivers. Rather, they should be
counted as part of the CPU state(s), to be saved and restored when CPU
context is saved and restored. Treating them as drivers leads to races
:>
Regards,
Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574
Maintainer of Suspend2 Kernel Patches http://suspend2.net
* Re: Nested suspends; messages vs. states
[not found] ` <1111613750.14853.117.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
@ 2005-03-23 21:54 ` Pavel Machek
[not found] ` <20050323215416.GK30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Pavel Machek @ 2005-03-23 21:54 UTC (permalink / raw)
To: Nigel Cunningham; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 819 bytes --]
Hi!
> > There are many drivers that do not fit your idea of "driver". Like
> > mtrrs. Some drivers only ever do work on resume, etc. Forcing a driver
> > to think about how to split itself into class->stop, bus->save_state and
> > bus->enter_state is a bad idea. [Notice that almost no drivers need a
> > ->save_state operation...]
>
> I don't think MTRRs should be counted as drivers. Rather, they should be
> counted as part of the CPU state(s), to be saved and restored when CPU
> context is saved and restored. Treating them as drivers leads to races
> :>
Okay, I'll need to make cpu hotplug deal with them. But MTRRs are not
the only strange device. Think timer, for example.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231008340.17099-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-23 21:02 ` Pavel Machek
@ 2005-03-23 23:14 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-23 23:14 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 2221 bytes --]
On Wed, 2005-03-23 at 10:20 -0800, Patrick Mochel wrote:
> I wasn't disputing that. What I was trying to say was the core could
> effectively do:
>
> for each device {
> 	device->suspend();	/* Stop Device */
> 	device->enter_state();	/* Power Down */
> }
Where is the room in the above for suspend() to be the one deciding what
state to enter? suspend() has no way to do it with your example. I'd
rather have suspend() optionally call device_enter_state() itself (and not
device->enter_state: if we decide to have a function pointer per state in
the state array, those won't be pointers in struct device, which is
probably better since enter_state() would end up just switch/casing on
them anyway).
> The reason I'm going down this road is because I think it could possibly
> be split into such a way like this:
>
> device->class->stop(dev, system_state);
> device->bus->save_state(dev, system_state);
> device->bus->enter_state(dev, system_state);
>
>
> The first would perform functional-level suspension - stopping current
> transactions and preventing future ones.
It's up to the driver's suspend() to do those. I would keep that policy
in there, really. I agree that a class->stop() may be useful for putting
the "common" code, but I'd rather have it seen as a "helper" that the
driver can call.
> The second would call down to the device driver and save the device
> context for the system state being entered.
>
> The third would call down to the device driver and actually enter the low
> power state.
save_state would be nop most of the time. enter_state may have to save
things too... I think the distinction between those is very academic and
makes little sense in practice.
> Functionally, this is what happens. Do you see a reason to/not to break it
> up programmatically?
Because I think it's a lot more flexible not to do it :) That is,
breaking it up _imposes_ a structure on drivers that I'd rather avoid in
this case. Just let them have suspend(), which can optionally use the
state mechanism and maybe a class "stop" helper, but some drivers will
want to do tricks etc... or do things before stopping the class that you
can't even imagine now :) So I'd rather not have the core do that.
Ben.
* Re: Nested suspends; messages vs. states
[not found] ` <200503231058.54311.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 19:37 ` Jordan Crouse
@ 2005-03-23 23:24 ` Benjamin Herrenschmidt
2005-03-24 2:45 ` David Brownell
1 sibling, 1 reply; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-23 23:24 UTC (permalink / raw)
To: David Brownell; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 4912 bytes --]
On Wed, 2005-03-23 at 10:58 -0800, David Brownell wrote:
> On Monday 21 March 2005 8:21 pm, Benjamin Herrenschmidt wrote:
> >
> > Yup, the model we are designing now should allow for arbitrary
> > transitions I suppose as long as the target state is legal vs. the
> > dependencies.
>
> Leading to the question: how are the dependencies identified and
> enforced?
I have proposed something, if you read my other mails, that should deal
with most of the usual cases.
> My current thought is that it's wrong to expect the PM core code to
> do all this. Discussion on this list strongly suggests to me that
> we'll never achieve a Grand Unified Theory of PM into which every
> device, bus, platform, and board will fit ... unless we give up
> on the notion that _everything_ be centrally controlled. The key
> will be having good ways to delegate all the important issues.
The problem with your approach is that, since you have in mind those
extremely weird embedded setups that can't fit in any sane model, you
decide to reject any model that does anything useful at the core. I don't
agree with your logic. Even most embedded systems would fit in the simple
model. In the case of on-chip devices driven by clock nets, it's even
possible to just consider clock domains as "busses" and, miracle, we end
up with a nice discrete list of bus states to apply to drivers and
dependencies that fit the proposed model.
> So for example it would make sense to have device suspend() logic
> verify any dependencies for the target state, and then just fail
> cleanly if they're not satisfied.
How ? It doesn't have the necessary knowledge unless we start moving a
lot of burden into drivers. And every time we'll move some algorithmic
requirements like that to drivers, 99% of them will get it wrong.
> That'll mostly be an issue for bridge drivers ... like PCI
> bridges, host adapters for things like USB, hubs, and so on.
Bridges/busses have no knowledge of device states, they can't verify
dependencies unless the devices expose their state list & dependency
information in a meaningful way, which is exactly what I'm proposing.
> But it also shows a way to handle custom hardware, which may
> not be as regular as the PC and server centric developers want
> the world to be...
Yah, I know those, I have been doing embedded dev too, and in most
cases, they aren't _that_ bad. And if they are ? well, they'll hack
around like they do today.
> I'm happy with per-driver responsibilities. Less so with the notion
> of conflating power states and performance modes; they seem a bit more
> orthogonal to me.
They tend to be very tied in practice though ...
> On the other hand, I've also described the role of a driver suspend()
> call as just picking one of potentially many device power states that
> are compatible with the target (system) state, and I think there's
> common ground there. If the "very low" device power state is compatible,
> nobody should care ... because the request to the driver should have been
> "become compatible with this system power state", and only in unusual
> cases (sysfs requests) "go into this device power state".
>
> So, two types of request to drivers then. The main one would be to
> become compatible with a given system power state; flexible. The
> inflexible one would be to go into a specific device power state.
But a device power state has an impact on the children of this device, thus
the dependency issue. I am not talking about system states here. Unless
you want to completely phase out dynamic PM of busses...
> The handful of drivers that deal with dependencies can be responsible
> for that ordering. They should be able to delegate much of the work
> to PM core code (else why have a PM core?).
No, I don't agree. That would be huge code duplication with different
kinds of bugs in <name all busses in linux, and that is a LOT>
> I suspect that having all device states in some sort of order like that
> is a problem isomorphic with the Grand Unified Theory of PM, which I
> said we shouldn't try to derive.
Bla bla bla bla... I am NOT proposing a f*king Grand Unified Whatever,
I'm proposing a _model_ which should deal with a vast majority of issues
with a simple driver side API, period. I'm not trying to put in
everything nor all the crazy cases. It seems you are reasoning in an
all-or-nothing way, and since all is not desirable nor even possible in
a sane way, then you say, let's do nothing. I do NOT agree.
> If the drivers that deal with dependencies -- "bridge" drivers, for
> now -- can tell the PM core what to do with a given device (before
> me, after me, skip it) the PM core could just ask those bridge drivers
> to build a list, then walk the list issuing the right calls. Instead
> of having a single global list, build it on demand.
>
> - Dave
--
Benjamin Herrenschmidt <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503221216430.954-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-23 23:49 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-23 23:49 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 1913 bytes --]
On Tue, 2005-03-22 at 12:24 -0500, Alan Stern wrote:
> On Tue, 22 Mar 2005, Pavel Machek wrote:
>
> > And do we really want user writing D2 to /sys file?
>
> Yes, absolutely. And we want the power/state file to contain "D2" when
> a PCI device is in that state.
No. "D2" isn't a leaf device state imho. It's a PCI state tho, but I
would have the driver expose some more meaningful names, like "standby",
"suspended", ... If an entire PCI bus goes unclocked, we could have
the PCI _bus_ expose D2. If the PCI bus is about to lose power, we could
have the PCI _bus_ expose D3. But the leaf driver's states should be more
meaningful, that is, the power states should be relative to the function
of the driver.
> > > Even just from first principles the mistake is apparent. pm_message_t is
> > > (or will be, when the structure is defined in its final form) a _message_,
> > > not a _state_. It contains (will contain) things other than the power
> > > state setting, such as the "flags" field. Why would a device want to
> > > store pm_message_t.flags as part of its current state?
> >
> > Because device may enter different hw states for different flags?
>
> But once the device is in a particular state, the reason why it entered
> that state doesn't matter any more. Certainly it shouldn't be _part_ of
> the state.
It does matter for one thing: whether the device can get out of the state
by itself (upon reception of a request, for example) or should block
all activities until resume().
> Consider this: Device states are bus- or device-specific structures, as
> discussed before. But the PM core can export a set of minimal generic
> state structures for use by drivers that don't need anything more
> complicated than On, Frozen, or Suspended. How does that sound?
Yes. That is also a compatibility path from the current scheme. But I
wouldn't make these states accessible to userland ...
Ben.
* Re: Nested suspends; messages vs. states
2005-03-23 3:31 ` Benjamin Herrenschmidt
2005-03-23 18:20 ` Patrick Mochel
@ 2005-03-24 1:27 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231724100.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 1:41 ` Patrick Mochel
2005-03-24 2:05 ` Patrick Mochel
3 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 1:27 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 953 bytes --]
[ Short replies, to prevent some messiness. ]
On Wed, 23 Mar 2005, Benjamin Herrenschmidt wrote:
> > > Note that rather than enter_state, I'd rather just have a function
> > > pointer enter_this_state in the driver state array ...
> >
> > Wouldn't that imply a different ->enter_state() method for each system
> > state?
>
> one enter state method for each driver state. If the driver has one
> enter state for each system state, then go for it.
Two things:
1) I meant just 1 ->enter_state() entry point for the core to call. It
won't call a different function depending on the state; it will leave it
up to the driver to determine what state to enter/what function to call.
Internally (or in its bus core), is where the array of enter_state()
methods would reside. Do you agree?
2) The system states are totally dependent on the platform. I don't see
how we could have a sane array that encapsulates every possible system
state. Thoughts?
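
Point 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the state names, struct my_device, and my_enter_state() are all hypothetical and not part of any existing kernel interface. The core sees a single entry point, and the table of per-state methods stays private to the driver (or its bus core).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver-private states; none of these names exist in the kernel. */
enum my_state { MY_ON, MY_FROZEN, MY_SUSPENDED, MY_NR_STATES };

struct my_device {
	enum my_state current;
};

typedef int (*my_enter_fn)(struct my_device *dev);

static int my_enter_on(struct my_device *dev)
{
	dev->current = MY_ON;
	return 0;
}

static int my_enter_frozen(struct my_device *dev)
{
	dev->current = MY_FROZEN;
	return 0;
}

static int my_enter_suspended(struct my_device *dev)
{
	dev->current = MY_SUSPENDED;
	return 0;
}

/* The per-state methods live inside the driver, invisible to the core. */
static const my_enter_fn my_enter_methods[MY_NR_STATES] = {
	[MY_ON]        = my_enter_on,
	[MY_FROZEN]    = my_enter_frozen,
	[MY_SUSPENDED] = my_enter_suspended,
};

/* The single entry point the core would call for every state. */
int my_enter_state(struct my_device *dev, int state)
{
	assert(dev != NULL);
	if (state < 0 || state >= MY_NR_STATES)
		return -1;	/* unknown state: refuse rather than guess */
	return my_enter_methods[state](dev);
}
```

Keeping the table private means the core never needs to know how many states a driver has, which fits with point 2.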
Pat
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
2005-03-23 3:31 ` Benjamin Herrenschmidt
2005-03-23 18:20 ` Patrick Mochel
2005-03-24 1:27 ` Patrick Mochel
@ 2005-03-24 1:41 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231727220.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 2:05 ` Patrick Mochel
3 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 1:41 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1148 bytes --]
On Wed, 23 Mar 2005, Benjamin Herrenschmidt wrote:
> > When a driver is bound to a device, it will modify the static array as
> > necessary (removing dangerous states, and changing the method pointers),
> > and it will fill in the pointer for device/driver specific states.
>
> The driver just exposes its array that gets attached to struct device.
> No need to "modify" an existing array or whatever. In 99.9% of the
> states, the array will just be a static structure in the driver.
I'm talking about an array of device states the device supports. They
would be exporting them to userspace for the purpose of device power
management. While it's related to the device states a driver supports,
it's not exactly the same thing.
You want to expose only the intersection of those sets (the states the
device and the driver both support) on a per-device basis. The easiest way
to do that would be with a per-device array. You could do it with a
per-driver array, but in the case where a driver supports different revs
of a chip with different power management capabilities, you'd have to
constantly check what rev of the chip you're using.
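
A minimal sketch of the per-device idea, assuming states can be represented as bits in a mask. The revision table, the masks, and the function names are made up for illustration. The intersection is computed once at probe time, so nothing has to re-check the chip revision later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state bits, loosely modelled on the PCI D-states. */
enum { ST_D0 = 1u << 0, ST_D1 = 1u << 1, ST_D2 = 1u << 2, ST_D3 = 1u << 3 };

/* What each (made-up) chip revision can do in hardware. */
struct chip_rev {
	int rev;
	uint32_t hw_states;
};

static const struct chip_rev known_revs[] = {
	{ 1, ST_D0 | ST_D3 },
	{ 2, ST_D0 | ST_D1 | ST_D2 | ST_D3 },
};

/* What the driver code knows how to handle, regardless of revision. */
static const uint32_t driver_states = ST_D0 | ST_D1 | ST_D3;

struct my_dev {
	int rev;
	uint32_t exposed_states;	/* per-device: hardware AND driver */
};

/* Compute the per-device set once, when the driver is bound. */
int my_probe(struct my_dev *dev, int rev)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < sizeof(known_revs) / sizeof(known_revs[0]); i++) {
		if (known_revs[i].rev == rev) {
			dev->rev = rev;
			dev->exposed_states = known_revs[i].hw_states & driver_states;
			return 0;
		}
	}
	return -1;	/* unknown revision */
}

/* Later requests only consult the per-device mask. */
int my_state_allowed(const struct my_dev *dev, uint32_t state)
{
	return (dev->exposed_states & state) != 0;
}
```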
Pat
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
2005-03-23 3:31 ` Benjamin Herrenschmidt
` (2 preceding siblings ...)
2005-03-24 1:41 ` Patrick Mochel
@ 2005-03-24 2:05 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231742090.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
3 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 2:05 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 4166 bytes --]
On Wed, 23 Mar 2005, Benjamin Herrenschmidt wrote:
> > When showing these states via sysfs, the show() method can simply access
> > the array for the particular state. For the setting of states, the store()
> > method would simply call the method pointer in the array for the requested
> > state.
>
> Yes and no. If we deal with parent/child dependencies, we'll have to be
> smarter than that. Again, we need to be able to do fancy things like
> putting a USB bus into "suspend" (ie. USB standard suspend state, not to
> be confused with "system" suspend, though the policy for system suspend is
> probably, at the USB bus level, to enter suspend anyway). For that, the
> bus driver will have to make sure all child devices are in a state
> compatible with the parent being suspended. This is why I want this
> state dependency mechanism.
>
> We can't have the USB bus driver know all possible states of children,
> that's contrary to the whole idea of letting leaf devices have any state
> they want. _However_, since child devices do know the parent state (the
> USB bus states have been clearly defined), they can have in their state
> array, a dependency indication indicating that they have to be put into
> state Y when the parent goes into state X or deeper (I want state to be
> ordered to make things easier).
Are the leaf devices ever going to enter some random, ill-defined state?
While a device could enter a number of states, that set seems finite.
Correct me if I'm wrong, I only know PM from a PCI perspective.
For PCI, there are 4 possible states a device could be in (ok 5, counting
D3-cold). How many power states are there in USB?
It would be trivial to add a set of lists to each bridge driver to hold
each device that is in a particular state. E.g. for PCI that would be:
	struct list_head devices_d0;
	struct list_head devices_d1;
	struct list_head devices_d2;
	struct list_head devices_d3;
	struct list_head devices_d3cold;
As devices are discovered and bound, they are put on the devices_d0 list.
As runtime power management happens, they would be moved to the
appropriate lists based on the power state they entered. When a bridge was
told to go into a certain power state, it could easily iterate over all
the devices that were only in a power state that had to change.
It would be trivial for a bus to do automatic opportunistic power
management. It could quickly check what was the lowest state it could
enter based on the highest power state a child could have:
	if (list_empty(&devices_d0)) {
		if (list_empty(&devices_d1)) {
			if (list_empty(&devices_d2)) {
				enter_b3();
			} else {
				enter_b2();
			}
		} else {
			enter_b1();
		}
	}
Or something like that. :)
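
Spelled out a little further, as a self-contained sketch: per-state counters stand in for the list_head lists here (only emptiness matters for the bus decision), and the B0..B3 names and all function names are assumptions made for the example.

```c
#include <assert.h>

/* Device states as seen by the bridge; counters replace the lists. */
enum dev_state { DEV_D0, DEV_D1, DEV_D2, DEV_D3, DEV_D3COLD, DEV_NR_STATES };
enum bus_state { BUS_B0, BUS_B1, BUS_B2, BUS_B3 };

struct bridge {
	unsigned int count[DEV_NR_STATES];	/* devices currently in each state */
};

/* Newly discovered and bound devices start out in D0. */
void bridge_add_device(struct bridge *br)
{
	br->count[DEV_D0]++;
}

/* Called when runtime PM moves a device from one state to another. */
void bridge_move_device(struct bridge *br, enum dev_state from,
			enum dev_state to)
{
	assert(br->count[from] > 0);
	br->count[from]--;
	br->count[to]++;
}

/*
 * The deepest bus state is limited by the highest-power child:
 * any D0 child keeps the bus in B0, any D1 child limits it to B1,
 * and so on.
 */
enum bus_state bridge_lowest_state(const struct bridge *br)
{
	if (br->count[DEV_D0])
		return BUS_B0;
	if (br->count[DEV_D1])
		return BUS_B1;
	if (br->count[DEV_D2])
		return BUS_B2;
	return BUS_B3;
}
```

Unlike the nested if above, this version also says what happens when a child is still in D0: the bus simply stays in B0.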
> Again, most devices will have a simple array, so it will end up being
> extremely simple for driver developers to implement. Granted, it makes
> bus iteration for us more complicated and pushes some complexity to the
> core. But, as I explained previously:
>
> - This is a complex problem
> - I'd rather have the complexity in a single place (the core) and keep
> the driver side as simple as possible
>
> I think we would fix a lot of our problems if we had a notion of a bus
> tree iterator. When the PM code needs to iterate the tree, it creates
> the iterator object which registers itself somewhere.
I agree, and it's easy enough to think of things with a bus-centric view.
But, how does that add complexity to the core? I envision the core doing
something like this:
- Keep a hierarchical list of buses
- Iterate over buses to put them to sleep
If we kept it at that, we could just call down to the bridge drivers and
have them iterate over the devices on their bus to suspend them. This
would push all the handling of leaf devices to the bus subsystems
themselves. That would keep the core simple, not matter to the leaf device
drivers, and place the burden on the bridge drivers.
The bridge drivers largely don't exist (except for USB hubs), the
requirements aren't very tough, and it would localize the semantics where
they need to be - in the bus subsystems.
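
A rough sketch of that split, with everything hypothetical (struct bus, struct leaf, and the function names are invented here): the core only walks the bus hierarchy, and each bridge supplies the routine that deals with the devices on its own bus.

```c
#include <assert.h>
#include <stddef.h>

struct leaf {
	int suspended;
};

struct bus {
	struct bus *children;		/* first bus below this one */
	struct bus *sibling;		/* next bus at the same level */
	struct leaf *devices;		/* leaf devices on this bus */
	size_t nr_devices;
	int (*suspend_devices)(struct bus *bus);	/* bridge driver hook */
	int suspended;
};

/* A bridge driver with no special needs could use something like this. */
int generic_suspend_devices(struct bus *bus)
{
	size_t i;

	for (i = 0; i < bus->nr_devices; i++)
		bus->devices[i].suspended = 1;
	return 0;
}

/* The core only knows about buses; leaf handling is delegated. */
int core_suspend_bus(struct bus *bus)
{
	struct bus *child;
	int err;

	assert(bus != NULL && bus->suspend_devices != NULL);

	/* Buses further down go first, so a parent never sleeps before them. */
	for (child = bus->children; child; child = child->sibling) {
		err = core_suspend_bus(child);
		if (err)
			return err;
	}

	err = bus->suspend_devices(bus);
	if (err)
		return err;
	bus->suspended = 1;
	return 0;
}
```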
Seems like a win all around..
Ok, now I'll read the rest of the threads..
Pat
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <200503231106.03160.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 20:29 ` Nigel Cunningham
@ 2005-03-24 2:13 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231810400.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
1 sibling, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 2:13 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1235 bytes --]
On Wed, 23 Mar 2005, David Brownell wrote:
> On Tuesday 22 March 2005 4:52 pm, Patrick Mochel wrote:
> > >
> > I think the core should always call ->suspend() for a device, regardless
> > of whether it thinks it's in a low power state, or inactive. This is
> > specifically for the reason that a device could be in a low-power runtime
> > state when the system is suspended.
>
> I don't quite see a need for this. If the parent/bridge driver knows
> the device is adequately suspended, why kick it again? It's actually
> rather annoying -- and error/bug prone! -- to have to code drivers to
> detect and cope with superfluous suspend calls.
The call doesn't need to actually kick the device. It just needs to check
some state field in the device object. My point was that the core
shouldn't differentiate between not-suspended and suspended devices.
Note that this point would be moot if we moved to a complete bus-centric
view of the tree. We wouldn't care about individual devices; we would just
call the bridge drivers, which could do their own checking on which
devices were suspended and needed to be called. Note that that would be
very trivial with lists of devices in each particular state, like I just
suggested. :)
Pat
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231727220.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 2:22 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-24 2:22 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1976 bytes --]
> I'm talking about an array of device states the device supports. They
> would be exporting them to userspace for the purpose of device power
> management. While it's related to the device states a driver supports,
> it's not exactly the same thing.
I would have a single array. Just use a flag in there to mark the ones
that are visible to userspace.
> You want to expose only the mutual set of those sets (the states the
> device and the driver both support) on a per-device basis. The easiest way
> to do that would be with a per-device array. You could do it with a
> per-driver array, but in the case where a driver supports different revs
> of a chip with different power management capabilities, you'd have to
> constantly check what rev of the chip you're using..
Too many arrays, we are going to complicate the driver-side of the model
too much. I'm thinking more about an array that can be pointed to
statically from the driver structure (which is usually static as well).
This is important for a reason: Devices may be added to a part of the
tree that is in a low power state. So the actual "state" upon probe() is
_not_ necessarily the "main" power state. The core must be able to
resolve the state dependency at device add time before probe I think, or
at least know what value to put in "current state" before probe(), so
the driver can eventually try to clamp the state up on the parent if it
needs more power to initialize itself.
Something like
	struct pm_state_list my_states = {
		{ "running", PM_STATE_USER, .... <whatever else> },
		{ "snoozing", PM_STATE_USER, .... <whatever else> },
		{ "suspended", PM_STATE_USER, .... <whatever else> },
	};

	struct pci_driver my_driver = {
		.../...
		dev.pm_states = &my_states;
	};
The above array is just a random example, we definitely have to work
more on the definition of a state. In that example, we have the state
name and a flags word, PM_STATE_USER indicating the state can be set by
the user.
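
To make the flag idea concrete, here is a compilable sketch under loud assumptions: struct pm_state, the value of PM_STATE_USER, and pm_find_user_state() are invented for illustration, and "snoozing" is deliberately left without the flag (unlike the example above) so that the filtering has something to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PM_STATE_USER	0x1u	/* hypothetical: state may be requested by the user */

struct pm_state {
	const char *name;
	unsigned int flags;
};

/* One static array per driver; the flag marks what userspace may see. */
const struct pm_state my_states[] = {
	{ "running",   PM_STATE_USER },
	{ "snoozing",  0 },		/* driver-internal in this sketch */
	{ "suspended", PM_STATE_USER },
};

#define MY_NR_STATES (sizeof(my_states) / sizeof(my_states[0]))

/*
 * Look up a state requested from userspace by name.
 * Returns the array index, or -1 if the state does not exist
 * or is not marked as user-settable.
 */
int pm_find_user_state(const struct pm_state *states, size_t n,
		       const char *name)
{
	size_t i;

	assert(states != NULL && name != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(states[i].name, name) != 0)
			continue;
		return (states[i].flags & PM_STATE_USER) ? (int)i : -1;
	}
	return -1;
}
```

A sysfs store() method could then refuse anything for which the lookup fails, without needing a second, userspace-only array.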
Ben.
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231742090.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 2:29 ` Benjamin Herrenschmidt
2005-03-24 5:02 ` David Brownell
1 sibling, 0 replies; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-24 2:29 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 4689 bytes --]
> Are the leaf devices ever going to enter some random, ill-defined state?
> While a device could enter a number of states, that set seems finite.
> Correct me if I'm wrong, I only know PM from a PCI perspective.
>
> For PCI, there are 4 possible states a device could be in (ok 5, counting
> D3-cold). How many power states are there in USB?
Power states of a device go far beyond what their bus provides. For
example, I have ideas of using that to provide a way for radeonfb to
underclock the video chip, with significant gain on power consumption.
There may be plenty of other operational modes on a given piece of HW that
aren't necessarily related to the PCI PM state. The latter is just a
"tool" for use by the driver; it isn't really useful to expose in
practice.
> It would be trivial to add a set of lists to each bridge driver to hold
> each device that is in a particular state. E.g. for PCI that would be:
>
> 	struct list_head devices_d0;
> 	struct list_head devices_d1;
> 	struct list_head devices_d2;
> 	struct list_head devices_d3;
> 	struct list_head devices_d3cold;
But they make no sense ! Have you any driver writing experience ? :)
Specs are nice, but sometimes quite far from reality. Those states
aren't even properly defined by the PCI Spec (their actual HW meaning is
not), and what state to enter for a given state and what is the effect
of that state is totally device dependent. Some devices support only a
subset of them, etc etc etc...
> As devices are discovered and bound, they are put on the devices_d0 list.
> As runtime power management happens, they would be moved to the
> appropriate lists based on the power state they entered. When a bridge was
> told to go into a certain power state, it could easily iterate over all
> the devices that were only in a power state that had to change.
No, we shouldn't even care about the PCI PM states IMHO. Drivers may
choose to put their device in a given PCI PM state because they know that
on such HW, that PM state has this specific effect etc... but that's not
something we want to expose beyond that. There _are_ some platform
requirements on PCI PM states for system suspend though, and we need a
way to address them (via pci_choose_state or equivalent maybe) but that
isn't even always properly dealt with by all HW anyway.
> It would be trivial for a bus to do automatic opportunistic power
> management. It could quickly check what was the lowest state it could
> enter based on the highest power state a child could have:
>
> 	if (list_empty(&devices_d0)) {
> 		if (list_empty(&devices_d1)) {
> 			if (list_empty(&devices_d2)) {
> 				enter_b3();
> 			} else {
> 				enter_b2();
> 			}
> 		} else {
> 			enter_b1();
> 		}
> 	}
>
> Or something like that. :)
Except that there is no definition of what a "D2" state means to a PCI
bus ... It might work if it meant "unclocked" (which is _usually_ the
case with some devices, assuming D2 means you can remove the clock but
not power), but it's not even properly specified.
> I agree, and it's easy enough to think of things with a bus-centric view.
> But, how does that add complexity to the core? I envision the core doing
> something like this:
>
> - Keep a hierarchical list of buses
> - Iterate over buses to put them to sleep
The core should only care about devices. A bus is just a special case of
device with children... and thus a possible dependency. States exposed by
busses are device states. PCI cannot really expose any since the PCI D
states don't really have any meaning. The only thing PCI can expose with
some useful sanity is "clock removed" and "power removed" ... There
may be room for a "low power" (only the minimum sleep power is
provided). Drivers could act on those, since those are states that
actually _mean_ something to the HW, thus driver designers could take
proper decisions on what to do. At the bus level, D1 or D2 have no
meaning. The PCI spec is broken in this regard (and many others ...)
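
As a sketch of what "states that mean something to the HW" could look like from a driver's side: the enum, struct my_chip, and my_chip_prepare() below are purely hypothetical, and the actions chosen for each case are just one plausible reading (stop DMA when the clock goes away, additionally save context when power goes away).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bus-level conditions with a concrete hardware meaning. */
enum bus_condition {
	BUS_FULL_POWER,
	BUS_LOW_POWER,		/* only minimum sleep power is provided */
	BUS_CLOCK_REMOVED,
	BUS_POWER_REMOVED,
};

struct my_chip {
	int dma_running;
	int context_saved;
};

/*
 * The driver decides what to do for each condition, because each one
 * tells it something definite about what the hardware will lose.
 */
int my_chip_prepare(struct my_chip *chip, enum bus_condition cond)
{
	assert(chip != NULL);

	switch (cond) {
	case BUS_FULL_POWER:
		return 0;			/* nothing to do */
	case BUS_LOW_POWER:
	case BUS_CLOCK_REMOVED:
		chip->dma_running = 0;		/* registers survive, DMA cannot run */
		return 0;
	case BUS_POWER_REMOVED:
		chip->dma_running = 0;
		chip->context_saved = 1;	/* registers will be lost */
		return 0;
	}
	return -1;				/* unknown condition */
}
```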
> If we kept it at that, we could just call down to the bridge drivers and
> have them iterate over the devices on their bus to suspend them. This
> would push all the handling of leaf devices to the bus subsystems
> themselves. That would keep the core simple, not matter to the leaf device
> drivers, and place the burden on the bridge drivers.
>
> The bridge drivers largely don't exist (except for USB hubs), the
> requirements aren't very tough, and it would localize the semantics where
> they need to be - in the bus subsystems.
>
> Seems like a win all around..
>
> Ok, now I'll read the rest of the threads..
>
>
> Pat
--
Benjamin Herrenschmidt <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503231544550.631-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-24 2:35 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231827310.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 2:35 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1697 bytes --]
On Wed, 23 Mar 2005, Alan Stern wrote:
> So what's wrong with changing it into:
>
> void suspend_to_ram()
> {
> 	lock all devices
> 	suspend all devices
> 	turn off main power
> 	/* Zzzz... */
> 	resume all devices
> 	unlock all devices
> }
>
> I agree that your single-semaphore idea would also work to prevent
> unwanted resumes during a system sleep. But it doesn't protect against
> the possibility of devices being removed or added while the PM core is
> traversing its lists. Locking does protect against that -- or rather, it
> will once the mechanisms have been added to the driver model core.
That would add a lot of complication. You never want to add more locks
than you need to. Without locking, each suspend operation is a single
discrete action with no dependencies on anything else. Having to lock every
device now intertwines them all in a very complicated and disgusting
manner. It would make it potentially very hard to debug and add a lot of
time to the process.
Note that your locking ideas are at least original, but I'm having a
harder and harder time taking them seriously without any code. I highly
recommend that you at least try to codify the locking changes before
making suggestions. It will weed out a lot of under-cooked ideas and get
us a lot closer to a workable solution. As Linus would say "Show me the
code!"
Take this from someone that has had to make provably correct shotgun
locking changes to code already in the kernel to appease a screaming Al
Viro (whose usual statement is something along the lines of only "There's
a race in function foo(). You idiot."), and a host of people that believe
his word as gospel by default.
Thanks,
Pat
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <20050323215416.GK30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-24 2:40 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231838570.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 2:40 UTC (permalink / raw)
To: Pavel Machek; +Cc: Nigel Cunningham, Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1151 bytes --]
On Wed, 23 Mar 2005, Pavel Machek wrote:
> Hi!
>
> > > There are many drivers that do not fit your idea of "driver". Like
> > > mtrrs. Some drivers only ever do work on resume, etc. Forcing driver
> > > to think how to split it into class->stop, bus->save_state and
> > > bus->enter state is bad idea. [Notice that almost no drivers need
> > > ->save_state operation...]
> >
> > I don't think MTRRs should be counted as drivers. Rather, they should be
> > counted as part of the CPU state(s), to be saved and restored when CPU
> > context is saved and restored. Treating them as drivers leads to races
> > :>
>
> Okay, I'll need to make cpu hotplug deal with them. But MTRR are not
> the only strange device. Think timer, for example.
That's a system device (like the current MTRR). In the name of sanity,
please don't make exceptions to the model based on them or platform
devices. Both of those models are FITH and need to be re-done.
Tangentially, if anyone has decent ideas on how to better represent those
types of devices, I'm very interested in knowing. I'd like to see them
fixed up and cleaned up in the next few months..
Thanks,
Pat
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
2005-03-23 23:24 ` Benjamin Herrenschmidt
@ 2005-03-24 2:45 ` David Brownell
[not found] ` <200503231845.55392.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-24 2:45 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 5845 bytes --]
On Wednesday 23 March 2005 3:24 pm, Benjamin Herrenschmidt wrote:
> On Wed, 2005-03-23 at 10:58 -0800, David Brownell wrote:
> > On Monday 21 March 2005 8:21 pm, Benjamin Herrenschmidt wrote:
> > >
> > > Yup, the model we are designing now should allow for arbitrary
> > > transitions I suppose as long as the target state is legal vs. the
> > > dependencies.
> >
> > Leading to the question: how are the dependencies identified and
> > enforced?
>
> I have proposed something, if you read my other mails, that should deal
> with most of the usual cases.
Some of those notions made sense to me, not all. So far it's fair
to say there's more handwaving (from everyone!) than anything else;
no single concrete proposal to be refined, and made workable.
> > My current thought is that it's wrong to expect the PM core code to
> > do all this. Discussion on this list strongly suggests to me that
> > we'll never achieve a Grand Unified Theory of PM into which every
> > device, bus, platform, and board will fit ... unless we give up
> > on the notion that _everything_ be centrally controlled. The key
> > will be having good ways to delegate all the important issues.
>
> The problem with your approach is since you have in mind those extremely
> weird embedded setups that can't fit in any sane model, you decide to
> reject any model that does anything useful at the core. I don't agree
> with your logic.
I hardly know where to start, given that baseless misrepresentation
of what I said. I objected to centralizing "_everything_" and you
replaced those commonsense words with wild allegations about sanity.
Calm down; switch to de-caff...
Well, considering that I'm using examples from PCI and USB too,
I certainly couldn't characterize what I "have in mind" as all
"extremely wierd embedded setups". And for that matter, taking
examples from widely used ARM hardware really doesn't seem to
be "extremely wierd" ... though it's certainly embedded.
I'm just expecting that if there's going to be a common Linux
PM framework, it had better work for more than PCI and swsusp.
Even on typical PC hardware (which includes Mac/PPC!!), stuff like
USB wakeup events seems "extremely *mainstream*" in terms of what
the hardware does.
And the fact that some of those USB models fit well with more
embedded approaches ... hmm, interesting. Paying attention to
those issues should thus be a general win...
> > So for example it would make sense to have device suspend() logic
> > verify any dependencies for the target state, and then just fail
> > cleanly if they're not satisfied.
>
> How ? It doesn't have the necessary knowledge unless we start moving a
> lot of burden into drivers. And every time we'll move some algorithmic
> requirements like that to drivers, 99% of them will get it wrong.
99% of them have no dependencies to worry about!! The remaining ones
are bridge drivers, which are already deeply worried about such stuff.
(And not getting a heck of a lot of support from Linux PM, either...)
(Oh, and only 83.7% of statistics are made up.)
> > That'll mostly be an issue for bridge drivers ... like PCI
> > bridges, host adapters for things like USB, hubs, and so on.
>
> Bridges/busses have no knowledge of device states, they can't verify
> dependencies unless the devices expose their state list & dependency
> information in a meaningful way, which is exactly what I'm proposing.
I've proposed similar things too. For those reasons; and others.
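
For concreteness, the kind of dependency information under discussion might look something like this. It is only a sketch: struct child_state, the deepest_parent field, and the two-state USB bus enum are assumptions made for the example, and states are taken to be ordered from shallowest to deepest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, ordered parent (bus) states: a larger value is deeper. */
enum usb_bus_state { USB_BUS_ON = 0, USB_BUS_SUSPEND = 1 };

/* One entry per child state, ordered from shallowest to deepest. */
struct child_state {
	const char *name;
	int deepest_parent;	/* deepest parent state this state tolerates */
};

/* Made-up state list for a keyboard-like leaf device. */
const struct child_state kbd_states[] = {
	{ "active", USB_BUS_ON },
	{ "idle",   USB_BUS_ON },
	{ "asleep", USB_BUS_SUSPEND },
};

/*
 * Find the shallowest child state, at or below the current one, that is
 * compatible with the parent's target state. Returns the state index,
 * or -1 if the child has no compatible state (the transition should
 * then fail cleanly).
 */
int child_state_for_parent(const struct child_state *states, size_t n,
			   size_t current, int parent_target)
{
	size_t i;

	assert(states != NULL);
	for (i = current; i < n; i++) {
		if (states[i].deepest_parent >= parent_target)
			return (int)i;
	}
	return -1;
}
```

Whoever walks the tree (core or bridge) only needs the table, not any understanding of what "idle" or "asleep" mean to the device.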
> > But it also shows a way to handle custom hardware, which may
> > not be as regular as the PC and server centric developers want
> > the world to be...
>
> Yah, I know those, I have been doing embedded dev too, and in most
> cases, they aren't _that_ bad. And if they are ? well, they'll hack
> around like they do today.
No, not that bad. But if the PM framework tries to delude itself
into thinking everything fits into one nice neat dependency tree, it
makes those solutions harder, uglier, and more error prone than
necessary. The same kind of stuff happens on PCs too.
> > So, two types of request to drivers then. The main one would be to
> > become compatible with a given system power state; flexible. The
> > inflexible one would be to go into a specific device power state.
>
> But a device power state has an impact on children of this device, thus
> the dependency issue. I am not talking about system states here. Unless
> you want to completely phase out dynamic PM of busses...
If you've hogtied your system by forcing some devices ("busses") into
certain states that prevent others from working, it seems only fair to
me that it stay hogtied.
I suspect you're actually agreeing with me there that some of the
drivers need flexibility to manage their power states. And maybe
even that such modes will be the main ones of interest...
My answer to the question of how those parent/child dependency
details should be managed was to ensure that the parent can do
what it needs to. That is, decentralize those issues. I don't
understand why you seem to dislike that approach, when so many
of your examples seem to confirm it would work.
> > The handful of drivers that deal with dependencies can be responsible
> > for that ordering. They should be able to delegate much of the work
> > to PM core code (else why have a PM core?).
>
> No, I don't agree. That would be huge code duplication with different
> kinds of bugs in <name all busses in linux, and that is a LOT>
Now you're assuming that "delegate to the PM core" isn't fundamentally
about reusing only the code that _can_ be common. Why is that? It's
pretty opposite to how I read those words. There's going to have to
be bus-specific ("bridge-specific") code. And we need a way to draw
boundaries between that and the more reusable code in pmcore.
- Dave
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231810400.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 2:52 ` David Brownell
0 siblings, 0 replies; 72+ messages in thread
From: David Brownell @ 2005-03-24 2:52 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 2071 bytes --]
On Wednesday 23 March 2005 6:13 pm, Patrick Mochel wrote:
>
> On Wed, 23 Mar 2005, David Brownell wrote:
>
> > On Tuesday 22 March 2005 4:52 pm, Patrick Mochel wrote:
> > > >
> > > I think the core should always call ->suspend() for a device, regardless
> > > of whether it thinks it's in a low power state, or inactive. This is
> > > specifically for the reason that a device could be in a low-power runtime
> > > state when the system is suspended.
> >
> > I don't quite see a need for this. If the parent/bridge driver knows
> > the device is adequately suspended, why kick it again? It's actually
> > rather annoying -- and error/bug prone! -- to have to code drivers to
> > detect and cope with superfluous suspend calls.
>
> The call doesn't need to actually kick the device. It just needs to check
> some state field in the device object. My point was that the core
> shouldn't differentiate between not-suspended and suspended devices.
It just has devices that it calls suspend() for, and resume() for?
That could make sense. Note that if it's not differentiating, then
it's not checking a state field... that task should IMO probably be
delegated to something that understands the relevant state fields.
(Like PCI code for PCI_D* states, USB code for USB states, etc.)
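
A minimal sketch of what that delegation might look like, assuming ordered D-state-like values; struct my_dev, hw_transitions, and my_suspend() are illustrative names only. The caller invokes suspend unconditionally, and the code that understands the state field decides whether the hardware needs touching.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ordered states: a larger value means lower power. */
enum my_power { MY_D0, MY_D1, MY_D2, MY_D3 };

struct my_dev {
	enum my_power state;
	int hw_transitions;	/* counts real hardware state changes */
};

/*
 * Called for every device, whether or not it is already suspended.
 * A device that is already at least as deep as requested is left
 * alone and success is still reported.
 */
int my_suspend(struct my_dev *dev, enum my_power target)
{
	assert(dev != NULL);

	if (dev->state >= target)
		return 0;	/* superfluous call: nothing to do */

	dev->state = target;	/* stands in for the real hardware transition */
	dev->hw_transitions++;
	return 0;
}
```

The check is one comparison, so the cost of a superfluous call stays small as long as it lives in the bus or driver code that knows how to interpret the state.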
> Note that this point would be moot if we moved to a complete bus-centric
> view of the tree. We wouldn't care about individual devices; we would just
> call the bridge drivers, which could do their own checking on which
> devices were suspended and needed to be called. Note that that would be
> very trivial with lists of devices in each particular state, like I just
> suggested. :)
Well the data structure issues could be discussed a lot, but I basically
like that notion of pushing intelligence about "what is PM" into the
components that actually deal with hardware ... bridges, busses, hubs,
nexi, whatever you want to call them.
We could call that "bus-centric" if you like, though there will be other
kinds of components with the intelligence too.
- Dave
>
> Pat
>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231838570.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 3:16 ` Nigel Cunningham
[not found] ` <1111634182.3430.1.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Nigel Cunningham @ 2005-03-24 3:16 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 1559 bytes --]
Hi Patrick.
On Thu, 2005-03-24 at 13:40, Patrick Mochel wrote:
> On Wed, 23 Mar 2005, Pavel Machek wrote:
>
> > Hi!
> >
> > > > There are many drivers that do not fit your idea of "driver". Like
> > > > mtrrs. Some drivers only ever do work on resume, etc. Forcing driver
> > > > to think how to split it into class->stop, bus->save_state and
> > > > bus->enter state is bad idea. [Notice that almost no drivers need
> > > > ->save_state operation...]
> > >
> > > I don't think MTRRs should be counted as drivers. Rather, they should be
> > > counted as part of the CPU state(s), to be saved and restored when CPU
> > > context is saved and restored. Treating them as drivers leads to races
> > > :>
> >
> > Okay, I'll need to make cpu hotplug deal with them. But MTRR are not
> > the only strange device. Think timer, for example.
>
> That's a system device (like the current MTRR). In the name of sanity,
> please don't make exceptions to the model based on them or platform
> devices. Both of those models are FITH and need to be re-done.
>
> Tangentially, if anyone has decent ideas on how to better represent those
> types of devices, I'm very interested in knowing. I'd like to see them
> fixed up and cleaned up in the next few months..
Just for clarity's sake, what are you thinking should happen to MTRR
support?
Regards,
Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574
Maintainer of Suspend2 Kernel Patches http://suspend2.net
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231742090.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 2:29 ` Benjamin Herrenschmidt
@ 2005-03-24 5:02 ` David Brownell
[not found] ` <200503232102.51132.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
1 sibling, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-24 5:02 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 4378 bytes --]
On Wednesday 23 March 2005 6:05 pm, Patrick Mochel wrote:
>
> On Wed, 23 Mar 2005, Benjamin Herrenschmidt wrote:
>
> >
> > We can't have the USB bus driver know all possible states of its children,
> > that's contrary to the whole idea of letting leaf devices have any state
> > they want.
I've found it useful to distinguish the states that are visible
at a component's lower interface ("towards the hardware") from those that
are visible at its upper interface ("towards userspace").
That way it's easy to draw important distinctions ... like the fact that
each one probably has a different sysfs object (e.g. lower is PCI device,
upper is network interface) ... and in PM terms, they don't need to expose
the same model. For example, the model that the kernel works with is not
necessarily one that should be conflated with application-visible states.
> > _However_, since child devices do know the parent state (the
> > USB bus states have been clearly defined), they can have in their state
> > array, a dependency indication indicating that they have to be put into
> > state Y when the parent goes into state X or deeper (I want state to be
> > ordered to make things easier).
>
> Are the leaf devices ever going to enter some random, ill-defined state?
I'd expect that all states manipulated by the kernel would be well defined,
but the kernel won't directly manage all power states. Examples of this
include "hdparm" managing disk idle timeouts and "xset" managing DPMS
ones; the kernel just passes requests through. (And lest anyone forget,
those two examples can represent as much power as some CPUs...)
Likewise there will be other application state not known to the kernel.
> While a device could enter a number of states, that set seems finite.
> Correct me if I'm wrong, I only know PM from a PCI perspective.
>
> For PCI, there are 4 possible states a device could be in (ok 5, counting
> D3-cold).
I'd count six PCI states. The five defined in the PCI PM spec, and
a sixth for "legacy PM" models ... sort of like PCI_D0, but AFAICT
not really as carefully defined.
> How many power states are there in USB?
Two: active and suspended. Active devices can draw a configurable
amount of power from VBUS (normally 100mA, up to 500mA), and they
receive traffic; normally a packet every millisecond. Suspended ones
draw much less current from VBUS (normally 0.5mA, up to 2.5mA) and
receive no traffic.
USB devices can also issue wakeup events, in common cases.
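
To make the two-state model concrete, here is a small userspace sketch
(not kernel code) of the VBUS budgets quoted above. The names
usb_power_state, usb_vbus_limit_ua and usb_draw_allowed are made up for
illustration, and treating the "normally X, up to Y" figures as a
default budget versus a configured high-power budget is my reading of
them, not something any spec text here confirms.

```c
#include <assert.h>
#include <stdbool.h>

/* The two USB power states described above. */
enum usb_power_state { USB_ACTIVE, USB_SUSPENDED };

/*
 * VBUS current limit in microamps for a given state.
 * high_power is a hypothetical flag meaning "configured for more than
 * the default budget"; the figures are the ones quoted in this mail.
 */
unsigned int usb_vbus_limit_ua(enum usb_power_state state, bool high_power)
{
	assert(state == USB_ACTIVE || state == USB_SUSPENDED);

	if (state == USB_ACTIVE)
		return high_power ? 500000u : 100000u;	/* 500mA : 100mA */
	return high_power ? 2500u : 500u;		/* 2.5mA : 0.5mA */
}

/* True if drawing draw_ua microamps is within the budget for that state. */
bool usb_draw_allowed(enum usb_power_state state, bool high_power,
		      unsigned int draw_ua)
{
	return draw_ua <= usb_vbus_limit_ua(state, high_power);
}
```

The point of the sketch is only that "suspended" is a hard contract on
the bus side: a device that keeps drawing its active current after being
suspended is out of budget by a couple of orders of magnitude.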
> It would be trivial to add a set of lists to each bridge driver to hold
> each device that is in a particular state.
Bridge-ish things I've had occasion to look at normally have on the
order of half a dozen children; rarely even a dozen. I'm not sure
having a list per state would help much... a simple list-of-children
data structure should suffice.
> > - This is a complex problem
> > - I'd rather have the complexity in a single place (the core) and keep
> > the driver side as simple as possible
> >
> > I think we would fix a lot of our problems if we had a notion of a bus
> > tree iterator. When the PM code needs to iterate the tree, it creates
> > the iterator object which registers itself somewhere.
>
> I agree, and it's easy enough to think of things with a bus-centric view.
> But, how does that add complexity to the core? I envision the core doing
> something like this:
>
> - Keep a hierarchical list of buses
> - Iterate over buses to put them to sleep
I'm inclined to Pat's perspective here. Although I don't really see
any reason to treat busses different from anything else that's got
child devices (as Benjamin has pointed out too).
> If we kept it at that, we could just call down to the bridge drivers and
> have them iterate over the devices on their bus to suspend them. This
> would push all the handling of leaf devices to the bus subsystems
> themselves. That would keep the core simple, not matter to the leaf device
> drivers, and place the burden on the bridge drivers.
Benjamin didn't like it much at all when I proposed that ... :)
> The bridge drivers largely don't exist (except for USB hubs), the
> requirements aren't very tough, and it would localize the semantics where
> they need to be - in the bus subsystems.
Yes to localizing semantics!! Though as for requirements, that's
not always true.
- Dave
* Re: Nested suspends; messages vs. states
[not found] ` <200503231845.55392.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-24 5:03 ` Benjamin Herrenschmidt
2005-03-24 5:27 ` David Brownell
0 siblings, 1 reply; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-24 5:03 UTC (permalink / raw)
To: David Brownell; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1157 bytes --]
On Wed, 2005-03-23 at 18:45 -0800, David Brownell wrote:
> If you've hogtied your system by forcing some devices ("busses") into
> certain states that prevent others from working, it seems only fair to
> me that it stay hogtied.
No. For example, I'm a host controller. I notice I didn't get any
request for a while, I want to enter a suspended state. That means going
through the dependencies of my children so they can all enter a state
compatible with me going to suspend.
> I suspect you're actually agreeing with me there that some of the
> drivers need flexibility to manage their power states. And maybe
> even that such modes will be the main ones of interest...
>
> My answer to the question of how those parent/child dependency
> details should be managed was to ensure that the parent can do
> what it needs to. That is, decentralize those issues. I don't
> understand why you seem to dislike that approach, when so many
> of your examples seem to confirm it would work.
I want to have the driver in control, yes. But I also want to have a
core that removes the burden from driver writers in the "generic" cases.
It's all a tradeoff to find :)
* Re: Nested suspends; messages vs. states
[not found] ` <200503232102.51132.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-24 5:14 ` Benjamin Herrenschmidt
2005-03-24 5:31 ` David Brownell
2005-03-24 8:16 ` Patrick Mochel
0 siblings, 2 replies; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-24 5:14 UTC (permalink / raw)
To: David Brownell; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1597 bytes --]
On Wed, 2005-03-23 at 21:02 -0800, David Brownell wrote:
> > If we kept it at that, we could just call down to the bridge drivers and
> > have them iterate over the devices on their bus to suspend them. This
> > would push all the handling of leaf devices to the bus subsystems
> > themselves. That would keep the core simple, not matter to the leaf device
> > drivers, and place the burden on the bridge drivers.
>
> Benjamin didn't like it much at all when I proposed that ... :)
Yes and no ... I dislike the word 'bus' here as I think we are drawing
an incorrect difference between a bus and a device, but it seems you
agree from your previous comments.
I think a model where we call the parent enter_state(), and that parent
is responsible to do all of the dependency resolving (including changing
child states) within his enter_state() call is nice.
However, I'm wondering what will be the stack usage of such a model on
deep bus layouts...
> > The bridge drivers largely don't exist (except for USB hubs), the
> > requirements aren't very tough, and it would localize the semantics where
> > they need to be - in the bus subsystems.
>
> Yes to localizing semantics!! Though as for requirements, that's
> not always true.
I don't agree with the bus subsystem being a good place here. I don't
like it for lots of reasons and never liked it, because it totally lacks
the notion of a bus "instance" among others... It's de-facto a device
"bus type", so it's a "bus type" subsystem more than a "bus subsystem"
imho. But careful explanation might still change my mind here.
Ben.
* Re: Nested suspends; messages vs. states
[not found] ` <20050323123725.201d8a67-aftB2sG12IhaqnLngUycEA@public.gmane.org>
@ 2005-03-24 5:16 ` David Brownell
0 siblings, 0 replies; 72+ messages in thread
From: David Brownell @ 2005-03-24 5:16 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 2519 bytes --]
On Wednesday 23 March 2005 11:37 am, Jordan Crouse wrote:
> On Wed, 23 Mar 2005 10:58:54 -0800
> "David Brownell" <david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org> wrote:
>
> > So, two types of request to drivers then. The main one would be to
> > become compatible with a given system power state; flexible. The
> > inflexible one would be to go into a specific device power state.
>
> I like the idea of the flexible request. This would add the ability for
> the policy to individually manage devices that for one reason or another
> cannot or should not enter a power state compatible with the system
> power state.
Why wouldn't "can't enter compatible state" be an error case, preventing
entry to that system power state?
> As an illustrative example, I'm thinking of a fictitious VoIP phone with
> an audio device that has an abnormally large latency resuming from a
> clocks off power state, so by the time it wakes up and is ready to go,
> the incoming call is lost.
I'll pick on that example though: the audio output and radio input
will be separate devices, with separate management. Audio won't be
providing a reason to drop the call; that'd be specific to the radio
handling code.
> At this point, the user/developer/designer
> could decide that the extra power consumed by leaving the clocks on
> is less important than having a responsive device, and the policy is set
> so that the device only enters a D1 state with a suspend-to-ram system
> power state, rather than a D3 state as it normally would (pardon the
> PCI/ACPI terms, they're just for simplicity).
There's no problem with different devices having different power
states. They're different devices for a reason, after all!
The radio needs to be active enough to detect it's the target
of a call, and issue a system wakeup event. Then the rest of
the system needs to be able to wake up quickly enough not to
make the user doze off. :)
> In that case, even though the device state wouldn't technically be
> compatible with the given system state, it would still be the
> best fit for the platform as a whole.
No, it would by definition be compatible. Otherwise the system
couldn't enter that state! There's no rule saying that all
PCI devices must be in PCI_D3 state when the system is in S3.
PCI D1 or D2 states can be more appropriate ... Ben has lots
of video driver examples (devices are sometimes reset in D3, and
maybe Linux relies on BIOS init), and there are other cases too.
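
A rough userspace sketch of what I mean, under stated assumptions: each
device has a policy-chosen target state for the system sleep (D1 for the
slow-to-wake audio part, D3 for something else), and a device that
cannot get there at all is an error that vetoes the transition. The
struct policy_dev type, its can_suspend flag and system_suspend() are
invented for this example; they are not existing PM core interfaces.

```c
#include <assert.h>
#include <stddef.h>

enum dev_state { DEV_D0, DEV_D1, DEV_D2, DEV_D3 };

struct policy_dev {
	enum dev_state state;	/* current device power state */
	enum dev_state target;	/* state policy picked for this system sleep */
	int can_suspend;	/* 0 simulates "can't enter a compatible state" */
};

/*
 * Put every device into its own target state. Different devices may end
 * up in different states; that is still "compatible" with the system
 * state. If one device cannot comply, undo the earlier ones and fail,
 * so the system never enters the sleep state.
 */
int system_suspend(struct policy_dev *devs, size_t n)
{
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!devs[i].can_suspend) {
			while (i-- > 0)
				devs[i].state = DEV_D0;	/* roll back */
			return -1;
		}
		devs[i].state = devs[i].target;
	}
	return 0;
}
```

Note that nothing in the loop requires the targets to be equal; only a
refusal aborts the transition.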
- Dave
> Jordan
>
>
>
>
* Re: Nested suspends; messages vs. states
2005-03-24 5:03 ` Benjamin Herrenschmidt
@ 2005-03-24 5:27 ` David Brownell
[not found] ` <200503232127.19576.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-24 5:27 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1924 bytes --]
On Wednesday 23 March 2005 9:03 pm, Benjamin Herrenschmidt wrote:
> On Wed, 2005-03-23 at 18:45 -0800, David Brownell wrote:
>
> > If you've hogtied your system by forcing some devices ("busses") into
> > certain states that prevent others from working, it seems only fair to
> > me that it stay hogtied.
>
> No. For example, I'm a host controller. I notice I didn't get any
> request for a while, I want to enter a suspended state. That means going
> through the dependencies of my children so they can all enter a state
> compatible with me going to suspend.
That's a different example though: you've given the host controller
flexibility. You have _not_ hogtied it.
The model we seem to be aiming towards in USB land is a bit different
than that though. When autosuspend is the goal, it bubbles up from
the bottom ... nodes (like HC) don't force children into idle, they
wait for the children to idle themselves and then take the opportunity
to snooze themselves. That's a model with wide applicability...
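
Here is a minimal userspace sketch of that bottom-up behavior, assuming
each node keeps a count of children that are still awake. The names
(struct pm_node, pm_node_init, pm_node_idle) are hypothetical and are
not taken from the USB stack; locking, timers and resume are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pm_node {
	struct pm_node *parent;
	unsigned int active_children;	/* children not yet suspended */
	bool busy;			/* node still has work of its own */
	bool suspended;
};

/* Register a node under its parent (parent may be NULL for the root). */
void pm_node_init(struct pm_node *node, struct pm_node *parent, bool busy)
{
	assert(node != NULL);
	node->parent = parent;
	node->active_children = 0;
	node->busy = busy;
	node->suspended = false;
	if (parent)
		parent->active_children++;
}

/*
 * Called by a driver when its device goes idle. The node suspends
 * itself, and every ancestor that now has no active children and no
 * work of its own takes the opportunity to suspend as well.
 * Nobody forces a child down; the suspend only bubbles upward.
 */
void pm_node_idle(struct pm_node *node)
{
	assert(node != NULL);
	node->busy = false;

	while (node && !node->busy && node->active_children == 0 &&
	       !node->suspended) {
		node->suspended = true;
		if (node->parent)
			node->parent->active_children--;
		node = node->parent;
	}
}
```

With a host controller, a hub below it, and a mouse and keyboard on the
hub, idling the mouse suspends only the mouse; idling the keyboard
afterwards lets the hub and then the host controller go down too.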
> > I suspect you're actually agreeing with me there that some of the
> > drivers need flexibility to manage their power states. And maybe
> > even that such modes will be the main ones of interest...
> >
> > My answer to the question of how those parent/child dependency
> > details should be managed was to ensure that the parent can do
> > what it needs to. That is, decentralize those issues. I don't
> > understand why you seem to dislike that approach, when so many
> > of your examples seem to confirm it would work.
>
> I want to have the driver in control, yes. But I also want to have a
> core that removes the burden from driver writers in the "generic" cases.
> It's all a tradeoff to find :)
And I don't want to aim for an all-singing/all-dancing pm core that
really can't deliver what's needed, because it's trying to centralize
things that are easier to address in a decentralized way. :)
* Re: Nested suspends; messages vs. states
2005-03-24 5:14 ` Benjamin Herrenschmidt
@ 2005-03-24 5:31 ` David Brownell
2005-03-24 8:16 ` Patrick Mochel
1 sibling, 0 replies; 72+ messages in thread
From: David Brownell @ 2005-03-24 5:31 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1277 bytes --]
On Wednesday 23 March 2005 9:14 pm, Benjamin Herrenschmidt wrote:
>
> However, I'm wondering what will be the stack usage of such a model on
> deep bus layouts...
I've previously sketched algorithms that will linearize things.
It's what I referred to when I talked about dynamically building
the lists of devices to suspend or resume.
> > > The bridge drivers largely don't exist (except for USB hubs), the
> > > requirements aren't very tough, and it would localize the semantics where
> > > they need to be - in the bus subsystems.
> >
> > Yes to localizing semantics!! Though as for requirements, that's
> > not always true.
>
> I don't agree with the bus subsystem being a good place here. I don't
> like it for lots of reasons and never liked it, because it totally lacks
> the notion of a bus "instance" among others... It's de-facto a device
> "bus type", so it's a "bus type" subsystem more than a "bus subsystem"
> imho. But careful explanation might still change my mind here.
Yes, there's bad terminology lurking. When I talk about a bus, I'm
rarely talking about a "bus type" like PCI or USB ... I'm talking about
a specific instance. Strongly agreed that the relevant scope is an
instance ("just another device, with children") not a type/class.
- Dave
* Re: Nested suspends; messages vs. states
[not found] ` <200503232127.19576.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-24 6:02 ` Benjamin Herrenschmidt
2005-03-24 6:31 ` David Brownell
0 siblings, 1 reply; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-24 6:02 UTC (permalink / raw)
To: David Brownell; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 593 bytes --]
> That's a different example though: you've given the host controller
> flexibility. You have _not_ hogtied it.
>
> The model we seem to be aiming towards in USB land is a bit different
> than that though. When autosuspend is the goal, it bubbles up from
> the bottom ... nodes (like HC) don't force children into idle, they
> wait for the children to idle themselves and then take the opportunity
> to snooze themselves. That's a model with wide applicability...
It is, though it requires every child driver to have an idle
mechanism ... do you think that will work in practice?
* Re: Nested suspends; messages vs. states
2005-03-24 6:02 ` Benjamin Herrenschmidt
@ 2005-03-24 6:31 ` David Brownell
[not found] ` <200503232231.00561.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-24 6:31 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 1408 bytes --]
On Wednesday 23 March 2005 10:02 pm, Benjamin Herrenschmidt wrote:
>
> > That's a different example though: you've given the host controller
> > flexibility. You have _not_ hogtied it.
> >
> > The model we seem to be aiming towards in USB land is a bit different
> > than that though. When autosuspend is the goal, it bubbles up from
> > the bottom ... nodes (like HC) don't force children into idle, they
> > wait for the children to idle themselves and then take the opportunity
> > to snooze themselves. That's a model with wide applicability...
>
> It is, though it requires every child driver to have an idle
> mechanism ... do you think that will work in practice?
When autosuspend is the goal, they'll implement it. Otherwise it's
not a goal, so not having it will not matter at all. And for leaf
node drivers, it's not at all tricky. :)
It matters for example with mice on laptops. I'm told that Intel
has measured and found that autosuspending mouse, then root hub
lets Centrino enter the C3 state, saving 2 Watts of power. Which
can be rather significant savings...
There are other strategies too, like having some external component
try to decide things. Maybe even users. Every strategy has plus
and minus points. One nice thing about autosuspend is that the
user interface is all but nonexistent. Also, most users are already
trained to expect such mechanisms elsewhere.
* Re: Nested suspends; messages vs. states
[not found] ` <200503232231.00561.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-24 6:36 ` Benjamin Herrenschmidt
2005-03-24 7:46 ` David Brownell
0 siblings, 1 reply; 72+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-24 6:36 UTC (permalink / raw)
To: David Brownell; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 2399 bytes --]
On Wed, 2005-03-23 at 22:31 -0800, David Brownell wrote:
> On Wednesday 23 March 2005 10:02 pm, Benjamin Herrenschmidt wrote:
> >
> > > That's a different example though: you've given the host controller
> > > flexibility. You have _not_ hogtied it.
> > >
> > > The model we seem to be aiming towards in USB land is a bit different
> > > than that though. When autosuspend is the goal, it bubbles up from
> > > the bottom ... nodes (like HC) don't force children into idle, they
> > > wait for the children to idle themselves and then take the opportunity
> > > to snooze themselves. That's a model with wide applicability...
> >
> > It is, though it requires every child driver to have an idle
> > mechanism ... do you think that will work in practice?
>
> When autosuspend is the goal, they'll implement it. Otherwise it's
> not a goal, so not having it will not matter at all. And for leaf
> node drivers, it's not at all tricky. :)
Still... it means a usage timer etc... on every device ... I just happen
to quite like the idea of the host controller "noticing" no URBs were
used so far :) Also, what about devices with no driver attached (nor
userland drivers) ?
> It matters for example with mice on laptops. I'm told that Intel
> has measured and found that autosuspending mouse, then root hub
> lets Centrino enter the C3 state, saving 2 Watts of power. Which
> can be rather significant savings...
It is, though I have no idea what C3 is ... But suspending root hub
definitely stops the hcca updates, so lets the CPU and host bridge rest,
so it's definitely a good thing. I should measure that on my laptop one
of these days. It's especially interesting on those new pmac laptops
with internal bluetooth as it's basically preventing auto-suspend of the
root hub currently (until there is that idle suspend mechanism
implemented, whether it is at the ohci level or at the bluetooth
level)... oh and the newest ones also have USB keyboards and
trackpads ...
> There are other strategies too, like having some external component
> try to decide things. Maybe even users. Every strategy has plus
> and minus points. One nice thing about autosuspend is that the
> user interface is all but nonexistent. Also, most users are already
> trained to expect such mechanisms elsewhere.
Yup.
--
Benjamin Herrenschmidt <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
* Re: Nested suspends; messages vs. states
2005-03-24 6:36 ` Benjamin Herrenschmidt
@ 2005-03-24 7:46 ` David Brownell
0 siblings, 0 replies; 72+ messages in thread
From: David Brownell @ 2005-03-24 7:46 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 3350 bytes --]
On Wednesday 23 March 2005 10:36 pm, Benjamin Herrenschmidt wrote:
> On Wed, 2005-03-23 at 22:31 -0800, David Brownell wrote:
> > On Wednesday 23 March 2005 10:02 pm, Benjamin Herrenschmidt wrote:
> > >
> > > > That's a different example though: you've given the host controller
> > > > flexibility. You have _not_ hogtied it.
> > > >
> > > > The model we seem to be aiming towards in USB land is a bit different
> > > > than that though. When autosuspend is the goal, it bubbles up from
> > > > the bottom ... nodes (like HC) don't force children into idle, they
> > > > wait for the children to idle themselves and then take the opportunity
> > > > to snooze themselves. That's a model with wide applicability...
> > >
> >...
>
> Still... it means a usage timer etc... on every device ... I just happen
> to quite like the idea of the host controller "noticing" no URBs were
> used so far :)
There are probably cases where some reusable logic could kick in.
Maybe not for choice of timeout though. :)
Think of that mouse example though. It's got an urb posted at all
times. Polling. The host controller will not observe the "idle"
nature. And it wouldn't be able to distinguish mice that can do
remote wakeup (so they're OK to suspend) from those that can't.
> Also, what about devices with no driver attached (nor
> userland drivers) ?
No driver? Suspend it always. (But be ready to resume and
maybe reset the device before probing it...)
Userland driver? That means the driver is "usbfs". It'd
make sense to have userspace initiate the suspend then.
> > It matters for example with mice on laptops. I'm told that Intel
> > has measured and found that autosuspending mouse, then root hub
> > lets Centrino enter the C3 state, saving 2 Watts of power. Which
> > can be rather significant savings...
>
> It is, though I have no idea what C3 is ...
Yet another CPU power state, but one incompatible with DMAs
that come by every millisecond. It may be Intel-only for all
I know, meaning only UHCI (and EHCI) will care.
> But suspending root hub
> definitely stops the hcca updates, so lets the CPU and host bridge rest,
> so it's definitely a good thing. I should measure that on my laptop one
> of these days.
That's part of why I added that idle-suspend to OHCI last year
sometime. Also, because on some hardware that's the only way
to conserve power short of turning off clocks and dropping the
VBUS power.
> It's especially interesting on those new pmac laptops
> with internal bluetooth as it's basically preventing auto-suspend of the
> root hub currently (until there is that idle suspend mechanism
> implemented, whether it is at the ohci level or at the bluetooth
> level)... oh and the newest ones also have USB keyboards and
> trackpads ...
I didn't know that. Interesting. Good thing the OHCI driver seems
pretty solid nowadays ... PM aside! :)
- Dave
> > There are other strategies too, like having some external component
> > try to decide things. Maybe even users. Every strategy has plus
> > and minus points. One nice thing about autosuspend is that the
> > user interface is all but nonexistent. Also, most users are already
> > trained to expect such mechanisms elsewhere.
>
> Yup.
>
> --
> Benjamin Herrenschmidt <benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
>
>
* Re: Nested suspends; messages vs. states
2005-03-24 5:14 ` Benjamin Herrenschmidt
2005-03-24 5:31 ` David Brownell
@ 2005-03-24 8:16 ` Patrick Mochel
1 sibling, 0 replies; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 8:16 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: David Brownell, Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2944 bytes --]
On Thu, 24 Mar 2005, Benjamin Herrenschmidt wrote:
> On Wed, 2005-03-23 at 21:02 -0800, David Brownell wrote:
>
> > > If we kept it at that, we could just call down to the bridge drivers and
> > > have them iterate over the devices on their bus to suspend them. This
> > > would push all the handling of leaf devices to the bus subsystems
> > > themselves. That would keep the core simple, not matter to the leaf device
> > > drivers, and place the burden on the bridge drivers.
> >
> > Benjamin didn't like it much at all when I proposed that ... :)
>
> Yes and no ... I dislike the word 'bus' here as I think we are drawing
> an incorrect difference between a bus and a device, but it seems you
> agree from your previous comments.
>
> I think a model where we call the parent enter_state(), and that parent
> is responsible to do all of the dependency resolving (including changing
> child states) within his enter_state() call is nice.
Uh, I think we're orbiting around the same thing. What you call a
'parent', I'm calling a 'bridge'. The set of the devices on it is a 'bus',
which is easily confused with the bus subsystems (USB, PCI). They're all
bad, overloaded names.
Is there a standard name for a node in a tree that is not a leaf (besides
non-leaf node)?
What I was saying before is that the core only needs to keep track of the
non-leaf nodes. It only needs to call the devices that do (or could) have
children, which will in turn call its children devices to suspend them. I
*think* it's the same thing you suggest.
> However, I'm wondering what will be the stack usage of such a model on
> deep bus layouts...
You don't need to recurse. The core doesn't recurse now, but keeps a list
in the proper hierarchical order. While it's arguable that it's not
right for non-traditional power domains, it could easily be resolved,
without recursion. Even without ordering, there are algorithms in the
kernel that descend through trees without recursing (see fs/dcache.c).
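
As an illustration of the "no recursion" point, here is a small
userspace sketch of a post-order walk (children before their parent,
which is the order you want for suspend) that uses only parent, child
and sibling pointers and constant stack. The struct dev_node layout and
the suspend_tree() name are invented for the example; this is not the
list-based scheme the core uses today and not the dcache code.

```c
#include <assert.h>
#include <stddef.h>

struct dev_node {
	struct dev_node *parent;
	struct dev_node *first_child;
	struct dev_node *next_sibling;
	int suspend_order;	/* filled in by suspend_tree(), 0 = first */
};

/* Follow first_child links down to the deepest leftmost node. */
static struct dev_node *leftmost_leaf(struct dev_node *n)
{
	while (n->first_child)
		n = n->first_child;
	return n;
}

/*
 * Visit every node under root so that all children are handled before
 * their parent. Stack usage does not grow with the depth of the tree.
 * Returns the number of nodes visited.
 */
int suspend_tree(struct dev_node *root)
{
	struct dev_node *n;
	int count = 0;

	assert(root != NULL);

	n = leftmost_leaf(root);
	for (;;) {
		n->suspend_order = count++;	/* "suspend" this node */
		if (n == root)
			break;
		if (n->next_sibling)
			n = leftmost_leaf(n->next_sibling);
		else
			n = n->parent;	/* all siblings done, go up */
	}
	return count;
}
```

Resume would be the same walk in reverse order, or a pre-order walk.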
> > > The bridge drivers largely don't exist (except for USB hubs), the
> > > requirements aren't very tough, and it would localize the semantics where
> > > they need to be - in the bus subsystems.
> >
> > Yes to localizing semantics!! Though as for requirements, that's
> > not always true.
>
> I don't agree with the bus subsystem being a good place here. I don't
> like it for lots of reasons and never liked it, because it totally lacks
> the notion of a bus "instance" among others... It's de-facto a device
> "bus type", so it's a "bus type" subsystem more than a "bus subsystem"
> imho. But careful explanation might still change my mind here.
We can have bus instances by default of what is described above. I don't
think bus instances are a terrible idea. We never kept them in the core
because they were ill-defined and not used. Since they've never been added
back, that notion is backed up. They could easily be added back.
Pat
* Re: Nested suspends; messages vs. states
[not found] ` <1111634182.3430.1.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
@ 2005-03-24 8:19 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240017460.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 8:19 UTC (permalink / raw)
To: Nigel Cunningham; +Cc: Linux-pm mailing list, Pavel Machek
[-- Attachment #1: Type: TEXT/PLAIN, Size: 386 bytes --]
On Thu, 24 Mar 2005, Nigel Cunningham wrote:
> Just for clarity's sake, what are you thinking should happen to MTRR
> support?
Become more componentized. We need a better way to represent optional
features of any device, but in particular CPUs. We should never be calling
it directly; we should be looping over the feature drivers bound to a 'CPU
Driver' and suspending them.
Pat
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231724100.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 9:59 ` Pavel Machek
[not found] ` <20050324095910.GD1354-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Pavel Machek @ 2005-03-24 9:59 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 1241 bytes --]
Hi!
> > > > Note that rather than enter_state, I'd rather just have a function
> > > > pointer enter_this_state in the driver state array ...
> > >
> > > Wouldn't that imply a different ->enter_state() method for each system
> > > state?
> >
> > one enter state method for each driver state. If the driver has one
> > enter state for each system state, then go for it.
>
> Two things:
>
> 1) I meant just 1 ->enter_state() entry point for the core to call. It
> won't call a different function depending on the state; it will leave it
> up to the driver to determine what state to enter/what function to call.
> Internally (or in its bus core), is where the array of enter_state()
> methods would reside. Do you agree?
>
> 2) The system states are totally dependent on the platform. I don't see
> how we could have a sane array that encapsulates every possible system
> state. Thoughts?
Seems to me that suspend-to-RAM and suspend-to-DISK make sense, and
cover 90% of what people want to do. Add "standby" for "fast
suspend-to-RAM" and you cover even ARM. I'd say that's good enough.
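
To show how small that would be, here is a userspace sketch combining
the single entry point from point 1 with just those three system states.
Everything named here (enum system_state, struct example_dev, the
enter_light/enter_deep handlers and example_enter_state) is made up for
illustration; it is not a proposed kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* The three system states suggested above. */
enum system_state {
	SYS_STANDBY,
	SYS_SUSPEND_TO_RAM,
	SYS_SUSPEND_TO_DISK,
	SYS_STATE_COUNT
};

/* Hypothetical device: power_level 0 is full power, 3 is off. */
struct example_dev {
	int power_level;
};

typedef int (*enter_fn)(struct example_dev *dev);

static int enter_light(struct example_dev *dev)
{
	dev->power_level = 1;
	return 0;
}

static int enter_deep(struct example_dev *dev)
{
	dev->power_level = 3;
	return 0;
}

/* Driver-private table: which handler serves which system state. */
static const enter_fn enter_table[SYS_STATE_COUNT] = {
	[SYS_STANDBY]         = enter_light,
	[SYS_SUSPEND_TO_RAM]  = enter_deep,
	[SYS_SUSPEND_TO_DISK] = enter_deep,
};

/*
 * The one entry point the core would call. The core never sees the
 * table; the driver decides what each system state means for it.
 */
int example_enter_state(struct example_dev *dev, enum system_state state)
{
	assert(dev != NULL);

	if ((unsigned int)state >= SYS_STATE_COUNT || !enter_table[state])
		return -1;
	return enter_table[state](dev);
}
```

A platform with more system states would only grow the enum and the
table; the entry point seen by the core stays the same.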
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
* CPU local things [was Re: Nested suspends; messages vs. states]
[not found] ` <Pine.LNX.4.50.0503240017460.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 10:01 ` Pavel Machek
[not found] ` <20050324100153.GE1354-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Pavel Machek @ 2005-03-24 10:01 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Nigel Cunningham, Linux-pm mailing list
[-- Attachment #1: Type: text/plain, Size: 753 bytes --]
Hi!
> > Just for clarity's sake, what are you thinking should happen to MTRR
> > support?
>
> Become more componentized. We need a better way to represent optional
> features of any device, but in particular CPUs. We should never be calling
> it directly; we should be looping over the feature drivers bound to a 'CPU
> Driver' and suspending them.
Actually, no.
mtrr's are cpu local. That means they need to be handled by CPU
hotplug framework. I guess we should just drop them from "normal"
device trees, and create something per-CPU.
Perhaps plain old notification list is enough for this one.
Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <20050324095910.GD1354-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-24 15:48 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240746290.24692-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 15:48 UTC (permalink / raw)
To: Pavel Machek; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 540 bytes --]
On Thu, 24 Mar 2005, Pavel Machek wrote:
> Seems to me that suspend-to-RAM and suspend-to-DISK make sense, and
> cover 90% of what people want to do. Add "standby" for "fast
> suspend-to-RAM" and you cover even ARM. I'd say that's good enough.
We can treat STR and Standby nearly identically, exceptions being what we
do with the CPU, and what power states the devices enter. But, are there
any others?
What about this "big sleep" and "deep sleep" stuff? What platform were
those for? Is there documentation about them?
Thanks,
Pat
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: CPU local things [was Re: Nested suspends; messages vs. states]
[not found] ` <20050324100153.GE1354-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-24 15:59 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240749030.24692-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 15:59 UTC (permalink / raw)
To: Pavel Machek; +Cc: Nigel Cunningham, Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2281 bytes --]
On Thu, 24 Mar 2005, Pavel Machek wrote:
> Hi!
>
> > > Just for clarity's sake, what are you thinking should happen to MTRR
> > > support?
> >
> > Become more componentized. We need a better way to represent optional
> > features of any device, but in particular CPUs. We should never be calling
> > it directly; we should be looping over the feature drivers bound to a 'CPU
> > Driver' and suspending them.
>
> Actually, no.
>
> mtrr's are cpu local. That means they need to be handled by CPU
> hotplug framework. I guess we should just drop them from "normal"
> device trees, and create something per-CPU.
Sorry, it was late and that explanation sucked.
- Every CPU has a set of optional features that it supports.
- MTRRs are an optional feature that a CPU may support.
- When the MTRR driver is loaded, a data structure should be allocated for
each CPU and added to a list.
- The list that the per-CPU MTRR data structure is added to could be part
of a 'CPU driver'.
- We should be looping over the set of optional features that a CPU
supports to suspend/resume them, rather than calling them directly.
Agree?
I agree that the CPU hotplug framework should help, since the save/restore
of MTRRs is needed for that feature as well.
I don't think they could/should go into normal device tree. They are part
of the special class of 'system' devices that don't support the notion of
a traditional driver and have to be handled at special times (when
interrupts are disabled) during the suspend/resume process.
I do think that the current model of system + platform devices kinda
stinks, but I do not think that merging them into the normal device tree
is the right solution.
> Perhaps plain old notification list is enough for this one.
It's possible, but notification lists present some problems. Like the fact
they use a hard-coded set of events in a global header file. They are good
only for a certain set of events.
It's damn simple to create a struct type for CPU features and a method
contained in each one for cpu offline/online. I would suggest adding a
list_head to struct cpu (include/linux/cpu.h) called 'features', then
having things like MTRR add themselves to that list.
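A rough userspace sketch of that idea, under stated assumptions: struct demo_cpu stands in for struct cpu, a plain 'next' pointer stands in for the kernel's list_head, and the MTRR save/restore functions are invented for illustration. None of these names exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU feature with its own offline/online methods. */
struct cpu_feature {
    const char *name;
    int (*offline)(struct cpu_feature *f);  /* save state before CPU goes down */
    int (*online)(struct cpu_feature *f);   /* restore state when CPU returns */
    struct cpu_feature *next;               /* stand-in for a list_head */
    void *data;                             /* feature-private state */
};

/* Stand-in for struct cpu with the suggested 'features' list. */
struct demo_cpu {
    int id;
    struct cpu_feature *features;
};

/* A feature such as MTRR adds itself to the CPU's list. */
void cpu_feature_register(struct demo_cpu *cpu, struct cpu_feature *f)
{
    assert(cpu != NULL && f != NULL);
    f->next = cpu->features;
    cpu->features = f;
}

/* Loop over the features instead of calling any of them directly. */
int cpu_features_offline(struct demo_cpu *cpu)
{
    struct cpu_feature *f;
    for (f = cpu->features; f != NULL; f = f->next) {
        int err = f->offline ? f->offline(f) : 0;
        if (err)
            return err;
    }
    return 0;
}

int cpu_features_online(struct demo_cpu *cpu)
{
    struct cpu_feature *f;
    for (f = cpu->features; f != NULL; f = f->next) {
        int err = f->online ? f->online(f) : 0;
        if (err)
            return err;
    }
    return 0;
}

/* Invented MTRR-like state: two registers and their saved copies. */
struct demo_mtrr_state {
    unsigned long regs[2];
    unsigned long saved[2];
};

int demo_mtrr_offline(struct cpu_feature *f)
{
    struct demo_mtrr_state *s = f->data;
    s->saved[0] = s->regs[0];
    s->saved[1] = s->regs[1];
    return 0;
}

int demo_mtrr_online(struct cpu_feature *f)
{
    struct demo_mtrr_state *s = f->data;
    s->regs[0] = s->saved[0];
    s->regs[1] = s->saved[1];
    return 0;
}
```

Because each feature carries its own methods, no global header has to enumerate events, which is the drawback of notification lists raised above.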
But, that's just my $0.02 (which is greatly devalued these days :)
Pat
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503240746290.24692-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 16:38 ` David Brownell
[not found] ` <200503240838.37628.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-24 16:38 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg; +Cc: Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 1683 bytes --]
On Thursday 24 March 2005 7:48 am, Patrick Mochel wrote:
>
> On Thu, 24 Mar 2005, Pavel Machek wrote:
>
> > Seems to me that suspend-to-RAM and suspend-to-DISK make sense, and
> > cover 90% of what people want to do. Add "standby" for "fast
> > suspend-to-RAM" and you cover even ARM. I'd say that's good enough.
>
> We can treat STR and Standby nearly identically, exceptions being what we
> do with the CPU, and what power states the devices enter. But, are there
> any others?
Quite possibly. Also look at the Montavista DPM stuff, where the
operating points are a superset of those CPU states ...
> What about this "big sleep" and "deep sleep" stuff? What platform were
> those for? Is there documentation about them?
The terms apply at least to OMAP, and there's plenty of documentation
for those chips. Look at the OMAP 5912 docs:
http://focus.ti.com/omap/docs/omapgenpage.tsp?navigationId=12341&templateId=5663&path=templatedata/cm/omapproc/data/omap5912
You'll have to register to get those (vs an NDA for an almost-identical
version used inside cell phones -- yes, some Linux based ones too! -- which
is one-big-PDF instead of lots of little one-per-chapter ones) and then
the "Power Management" reference guide, SPRU753, talks about those details.
Other platforms could use the same names differently of course. Capsule
summary, "deep" means there's only a 32KHz clock, while "big" means the
48 MHz one is available to peripherals that need it (UARTs, USB, MMC/SD,
camera, and so forth).
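As a small illustration of that capsule summary (not of any real OMAP code), a platform could describe its states in a table and let drivers ask whether the clock they need survives. The structure, field, and function names below are hypothetical, and only the one distinction stated above is modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one platform sleep state. */
struct demo_sleep_state {
    const char *name;
    bool periph_48mhz;   /* 48 MHz clock still available to peripherals */
};

/* Only the distinction from the summary above: "big" keeps 48 MHz. */
static const struct demo_sleep_state demo_states[] = {
    { "big",  true  },
    { "deep", false },
};

/* True if a peripheral needing the 48 MHz clock can keep working. */
bool demo_state_keeps_48mhz(const char *name)
{
    size_t i;

    if (name == NULL)
        return false;
    for (i = 0; i < sizeof(demo_states) / sizeof(demo_states[0]); i++)
        if (strcmp(demo_states[i].name, name) == 0)
            return demo_states[i].periph_48mhz;
    return false;   /* unknown state: assume the clock is gone */
}
```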
Other ARM SOCs have similar distinctions; you might look at the Atmel
AT91rm9200 for one that's a lot simpler (and where Linux support isn't
quite as mature).
- Dave
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <200503240838.37628.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-24 17:00 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240858200.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 17:00 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg, Pavel Machek
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1266 bytes --]
On Thu, 24 Mar 2005, David Brownell wrote:
> On Thursday 24 March 2005 7:48 am, Patrick Mochel wrote:
> > What about this "big sleep" and "deep sleep" stuff? What platform were
> > those for? Is there documentation about them?
>
> The terms apply at least to OMAP, and there's plenty of documentation
> for those chips. Look at the OMAP 5912 docs:
>
> http://focus.ti.com/omap/docs/omapgenpage.tsp?navigationId=12341&templateId=5663&path=templatedata/cm/omapproc/data/omap5912
>
> You'll have to register to get those (vs an NDA for an almost-identical
> version used inside cell phones -- yes, some Linux based ones too! -- which
> is one-big-PDF instead of lots of little one-per-chapter ones) and then
> the "Power Management" reference guide, SPRU753, talks about those details.
Thanks.
> Other platforms could use the same names differently of course. Capsule
> summary, "deep" means there's only a 32KHz clock, while "big" means the
> 48 MHz one is available to peripherals that need it (UARTs, USB, MMC/SD,
> camera, and so forth).
It sounds like they refer to low-power states in which the system is still
operating, which are distinct from the STD/STR/Standby that we're used to
that are non-operational low-power states. Is that correct?
Pat
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503231827310.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 17:03 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503241149000.1345-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Alan Stern @ 2005-03-24 17:03 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2478 bytes --]
On Wed, 23 Mar 2005, Patrick Mochel wrote:
> On Wed, 23 Mar 2005, Alan Stern wrote:
>
> > So what's wrong with changing it into:
> >
> > void suspend_to_ram()
> > {
> > lock all devices
> > suspend all devices
> > turn off main power
> > /* Zzzz... */
> > resume all devices
> > unlock all devices
> > }
> >
> > the possibility of devices being removed or added while the PM core is
> > traversing its lists. Locking does protect against that -- or rather, it
> > will once the mechanisms have been added to the driver model core.
>
> That would add a lot of complication. You never want to add more locks
> than you need to.
The semaphores themselves are required for other reasons, but I take your
point. You never want to lock more items than you need to.
> Without locking, each suspend operation is a single
discrete action with no dependencies on anything else. Having to lock every
> device now intertwines them all in a very complicated and disgusting
> manner.
Actually the locking rule is quite simple and elegant: Never lock a
device while holding a descendant's lock.
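To make the rule concrete, here is a userspace sketch that uses pthread mutexes as stand-ins for the per-device semaphores. It is not the USB code; struct demo_node and the helper names are invented for illustration. Locks are taken top-down, so an ancestor's lock is always acquired before any descendant's, and released bottom-up.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_CHILDREN 4

/* Hypothetical device-tree node with its own lock. */
struct demo_node {
    pthread_mutex_t lock;
    struct demo_node *children[DEMO_MAX_CHILDREN];
    int nchildren;
    int locked;   /* bookkeeping for the demo only */
};

void demo_node_init(struct demo_node *n)
{
    pthread_mutex_init(&n->lock, NULL);
    n->nchildren = 0;
    n->locked = 0;
}

int demo_node_add_child(struct demo_node *parent, struct demo_node *child)
{
    if (parent->nchildren >= DEMO_MAX_CHILDREN)
        return -1;
    parent->children[parent->nchildren++] = child;
    return 0;
}

/* Top-down: a node is locked before any of its descendants, so we never
 * try to lock a device while holding a descendant's lock. */
void demo_lock_subtree(struct demo_node *n)
{
    int i;

    assert(n != NULL);
    pthread_mutex_lock(&n->lock);
    n->locked = 1;
    for (i = 0; i < n->nchildren; i++)
        demo_lock_subtree(n->children[i]);
}

/* Bottom-up release, the reverse of the acquisition order. */
void demo_unlock_subtree(struct demo_node *n)
{
    int i;

    assert(n != NULL);
    for (i = n->nchildren - 1; i >= 0; i--)
        demo_unlock_subtree(n->children[i]);
    n->locked = 0;
    pthread_mutex_unlock(&n->lock);
}
```

Since every path takes locks in the same ancestor-first order, two threads working on overlapping subtrees cannot deadlock on each other through these locks alone.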
> It would make it potentially very hard to debug and add a lot of
> time to the process.
Hard to debug, maybe... we can't tell without actually trying. Adding a
lot of time to the suspend process, no. Acquiring the locks would block
only for things that should cause you to block anyway, like trying to
suspend a device while it's being probed.
(Or did you mean it would add a lot of time to the development process? I
doubt it, but it would force people to think about issues they would
prefer to ignore.)
> Note that your locking ideas are at least original, but I'm having a
> harder and harder time taking them seriously without any code. I highly
> recommend that you at least try to codify the locking changes before
> making suggestions. It will weed out a lot of under-cooked ideas and get
> us a lot closer to a workable solution. As Linus would say "Show me the
> code!"
A large part of the concept is already coded up and part of the kernel
since about 2.6.9, and it works quite well. Its scope is currently
restricted to the USB layer; I'm proposing to make it more general.
In fact at this stage it's more a matter for discussion under the topic
of driver-model development, so I'm going to stop talking about it on
linux-pm. You indicated that you had some relevant driver-model patches
-- would you like to send them to me?
Alan Stern
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.44L0.0503241149000.1345-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-24 17:13 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240904570.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 17:13 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 2561 bytes --]
On Thu, 24 Mar 2005, Alan Stern wrote:
> On Wed, 23 Mar 2005, Patrick Mochel wrote:
> > It would make it potentially very hard to debug and add a lot of
> > time to the process.
>
> Hard to debug, maybe... we can't tell without actually trying. Adding a
> lot of time to the suspend process, no. Acquiring the locks would block
> only for things that should cause you to block anyway, like trying to
> suspend a device while it's being probed.
It would change the locking from an O(1) operation to an O(n) operation,
where n is the number of devices. Taking any lock is not cheap, so taking
n locks, when n is large, is going to be grossly inefficient.
> (Or did you mean it would add a lot of time to the development process? I
> doubt it, but it would force people to think about issues they would
> prefer to ignore.)
Uh, I hadn't thought about it, but yes, that's true as well. While in
theory it's good to make people think about what they're doing, you want
to make things as simple as possible for people to implement ("Make it
impossible to get wrong."). Especially for driver writers, who typically
have the least experience with the Linux kernel, and in many cases, very
little experience with multi-processor systems and race conditions.
[ Note that it's still easy to get things wrong in the driver model, but
that's specifically where the effort is going - to make it harder to mess
up. ]
> > Note that your locking ideas are at least original, but I'm having a
> > harder and harder time taking them seriously without any code. I highly
> > recommend that you at least try to codify the locking changes before
> > making suggestions. It will weed out a lot of under-cooked ideas and get
> > us a lot closer to a workable solution. As Linus would say "Show me the
> > code!"
>
> A large part of the concept is already coded up and part of the kernel
> since about 2.6.9, and it works quite well. Its scope is currently
> restricted to the USB layer; I'm proposing to make it more general.
Really? And it works well? Greg, David? What do you guys think of it?
> In fact at this stage it's more a matter for discussion under the topic
> of driver-model development, so I'm going to stop talking about it on
> linux-pm. You indicated that you had some relevant driver-model patches
> -- would you like to send them to me?
I apologize, I assumed you had seen them. You can find them here:
http://article.gmane.org/gmane.linux.kernel/289092
specifically the klist patch:
http://article.gmane.org/gmane.linux.kernel/289090
Pat
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: CPU local things [was Re: Nested suspends; messages vs. states]
[not found] ` <Pine.LNX.4.50.0503240749030.24692-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 17:14 ` Nathan Lynch
2005-03-24 20:59 ` Nigel Cunningham
0 siblings, 1 reply; 72+ messages in thread
From: Nathan Lynch @ 2005-03-24 17:14 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Nigel Cunningham, Linux-pm mailing list, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 1984 bytes --]
On Thu, Mar 24, 2005 at 07:59:21AM -0800, Patrick Mochel wrote:
>
> On Thu, 24 Mar 2005, Pavel Machek wrote:
>
> > Hi!
> >
> > > > Just for clarity's sake, what are you thinking should happen to MTRR
> > > > support?
> > >
> > > Become more componentized. We need a better way to represent optional
> > > features of any device, but in particular CPUs. We should never be calling
> > > it directly; we should be looping over the feature drivers bound to a 'CPU
> > > Driver' and suspending them.
> >
> > Actually, no.
> >
> > mtrr's are cpu local. That means they need to be handled by CPU
> > hotplug framework. I guess we should just drop them from "normal"
> > device trees, and create something per-CPU.
>
> Sorry, it was late and that explanation sucked.
>
> - Every CPU has a set of optional features that it supports.
>
> - MTRRs are an optional feature that a CPU may support.
>
> - When the MTRR driver is loaded, a data structure should be allocated for
> each CPU and added to a list.
>
> - The list that the per-CPU MTRR data structure is added to could be part
> of a 'CPU driver'.
>
> - We should be looping over the set of optional features that a CPU
> supports to suspend/resume them, rather than calling them directly.
Don't sysdev_suspend and sysdev_resume do this already?
> > Perhaps plain old notification list is enough for this one.
>
> It's possible, but notification lists present some problems. Like the fact
> they use a hard-coded set of events in a global header file. They are good
> only for a certain set of events.
>
> It's damn simple to create a struct type for CPU features and a method
> contained in each one for cpu offline/online. I would suggest adding a
> list_head to struct cpu (include/linux/cpu.h) called 'features', then
> having things like MTRR add themselves to that list.
Why is the existing sysdev auxiliary driver support not sufficient?
Last time I checked, cpufreq uses it for this kind of purpose.
Nathan
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503240858200.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 17:33 ` David Brownell
[not found] ` <200503240933.49123.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: David Brownell @ 2005-03-24 17:33 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg; +Cc: Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 992 bytes --]
On Thursday 24 March 2005 9:00 am, Patrick Mochel wrote:
> > Other platforms could use the same names differently of course. Capsule
> > summary, "deep" means there's only a 32KHz clock, while "big" means the
> > 48 MHz one is available to peripherals that need it (UARTs, USB, MMC/SD,
> > camera, and so forth).
>
> It sounds like they refer to low-power states in which the system is still
> operating, which are distinct from the STD/STR/Standby that we're used to
> that are non-operational low-power states. Is that correct?
Well, _some_ parts of the system are still operating. But that's
true of STR/Standby too ... some components are operating well
enough to issue wakeup events. (Even in some STD modes.) And
return from STR/Standby doesn't mean all device state is trashed;
those devices don't go "non-operational" as in "reset" either.
It may be fair to say that more parts are more functional though;
making detailed comparisons hasn't been high on my tasklist!
- Dave
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <200503240933.49123.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-24 17:41 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240937150.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
0 siblings, 1 reply; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 17:41 UTC (permalink / raw)
To: David Brownell; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg, Pavel Machek
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1368 bytes --]
On Thu, 24 Mar 2005, David Brownell wrote:
> On Thursday 24 March 2005 9:00 am, Patrick Mochel wrote:
>
> > > Other platforms could use the same names differently of course. Capsule
> > > summary, "deep" means there's only a 32KHz clock, while "big" means the
> > > 48 MHz one is available to peripherals that need it (UARTs, USB, MMC/SD,
> > > camera, and so forth).
> >
> > It sounds like they refer to low-power states in which the system is still
> > operating, which are distinct from the STD/STR/Standby that we're used to
> > that are non-operational low-power states. Is that correct?
>
> Well, _some_ parts of the system are still operating. But that's
> true of STR/Standby too ... some components are operating well
> enough to issue wakeup events. (Even in some STD modes.) And
> return from STR/Standby doesn't mean all device state is trashed;
> those devices don't go "non-operational" as in "reset" either.
That's a bit of a stretch. While it's true that some devices can generate
wakeup interrupts, they are not servicing normal requests. Enabling the
wakeup events seems to imply disabling other services.
It sounds like in the case you speak of, the devices are still doing
'normal' work, but it also seems like that normal work is self-contained
in the device and doesn't need any code executed on a CPU to do so (like
via an ISR).
Pat
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503240904570.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 17:46 ` David Brownell
2005-03-24 17:51 ` Patrick Mochel
2005-03-24 19:27 ` Alan Stern
2 siblings, 0 replies; 72+ messages in thread
From: David Brownell @ 2005-03-24 17:46 UTC (permalink / raw)
To: linux-pm-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 1498 bytes --]
On Thursday 24 March 2005 9:13 am, Patrick Mochel wrote:
>
> On Thu, 24 Mar 2005, Alan Stern wrote:
>
> >
> > A large part of the concept is already coded up and part of the kernel
> > since about 2.6.9, and it works quite well. Its scope is currently
> > restricted to the USB layer; I'm proposing to make it more general.
>
> Really? And it works well? Greg, David? What do you guys think of it?
USB locking keeps tripping over the driver model. At all levels,
including driver binding, device configuration, port reset, remote
wakeup, disconnect processing, and suspend/resume. The bus rwsem
has been particularly troublesome for tree and subtree operations
(which include most of those operations I listed).
I'll be interested to hear Alan's feedback on your proposed patches.
We've kind of taken turns butting heads against the driver model,
and he's freshest.
By the way, a lock hierarchy isn't an O(N) proposition unless you're
aiming to acquire extra locks (wholesale).
- Dave
> > In fact at this stage it's more a matter for discussion under the topic
> > of driver-model development, so I'm going to stop talking about it on
> > linux-pm. You indicated that you had some relevant driver-model patches
> > -- would you like to send them to me?
>
> I apologize, I assumed you had seen them. You can find them here:
>
> http://article.gmane.org/gmane.linux.kernel/289092
>
> specifically the klist patch:
>
> http://article.gmane.org/gmane.linux.kernel/289090
>
>
> Pat
>
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503240904570.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 17:46 ` David Brownell
@ 2005-03-24 17:51 ` Patrick Mochel
2005-03-24 19:27 ` Alan Stern
2 siblings, 0 replies; 72+ messages in thread
From: Patrick Mochel @ 2005-03-24 17:51 UTC (permalink / raw)
To: Alan Stern; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1235 bytes --]
On Thu, 24 Mar 2005, Patrick Mochel wrote:
>
> On Thu, 24 Mar 2005, Alan Stern wrote:
>
> > On Wed, 23 Mar 2005, Patrick Mochel wrote:
>
> > > It would make it potentially very hard to debug and add a lot of
> > > time to the process.
> >
> > Hard to debug, maybe... we can't tell without actually trying. Adding a
> > lot of time to the suspend process, no. Acquiring the locks would block
> > only for things that should cause you to block anyway, like trying to
> > suspend a device while it's being probed.
>
> It would change the locking from an O(1) operation to an O(n) operation,
> where n is the number of devices. Taking any lock is not cheap, so taking
> n locks, when n is large, is going to be grossly inefficient.
Ok, I'm a hypocrite. :)
The first patch in the series that I posted adds a semaphore to struct
device that is taken before each driver operation, including suspend and
resume. It is dropped, though, when the operation is complete. I think
that's the right way to do it, and I'm interested to hear if that will
work for what you guys want to do.
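In outline, the pattern looks something like the sketch below. This is a userspace approximation, not the patch itself: a pthread mutex stands in for the semaphore, and struct demo_device and demo_call_locked() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical device with a per-device lock. */
struct demo_device {
    pthread_mutex_t sem;   /* stand-in for the semaphore in struct device */
    int suspended;
};

typedef int (*demo_op)(struct demo_device *dev);

/* Take the device's lock, run one driver operation, then drop the lock.
 * The lock is held only for the duration of that single operation. */
int demo_call_locked(struct demo_device *dev, demo_op op)
{
    int ret;

    assert(dev != NULL && op != NULL);
    pthread_mutex_lock(&dev->sem);
    ret = op(dev);
    pthread_mutex_unlock(&dev->sem);
    return ret;
}

/* Example driver operations. */
int demo_suspend(struct demo_device *dev) { dev->suspended = 1; return 0; }
int demo_resume(struct demo_device *dev)  { dev->suspended = 0; return 0; }
```

Each device is locked and unlocked in turn, so a full suspend pass still performs one acquire/release per device.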
I *do* realize that, in terms of time spent acquiring and releasing locks,
it is equivalent to what I bashed above. I will eat my words now.
Thanks,
Pat
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503240937150.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
@ 2005-03-24 18:08 ` David Brownell
0 siblings, 0 replies; 72+ messages in thread
From: David Brownell @ 2005-03-24 18:08 UTC (permalink / raw)
To: Patrick Mochel; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 1312 bytes --]
On Thursday 24 March 2005 9:41 am, Patrick Mochel wrote:
>
>
> That's a bit of a stretch. While it's true that some devices can generate
> wakeup interrupts, they are not servicing normal requests. Enabling the
> wakeup events seems to imply disabling other services.
There's not much distinction between wakeup irqs and "normal" ones.
Normally they're one and the same. (That's not OMAP-specific.)
> It sounds like in the case you speak of, the devices are still doing
> 'normal' work, but it also seems like that normal work is self-contained
> in the device and doesn't need any code executed on a CPU to do so (like
> via an ISR).
Well, the IRQ would transition the SOC out of say "big sleep", then the
ISR would run. (I'm not sure whether DMAs can stay active in big sleep.)
Just this morning I saw an experimental patch teaching the smc91x Ethernet
driver -- external chip, hooked up to many OMAP devel boards -- to setup
its GPIO IRQ line as a wakeup IRQ. So the normal work of receiving
and transmitting packets using that chip's local buffers could go on while
the SOC went to sleep.
You're right, it sounds _just_ like that! :)
One reason I mentioned the AT91rm9200 is because it has a "slow clock mode"
where the CPU itself is running at 32KHz. Variations on a low-power theme.
- Dave
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: Nested suspends; messages vs. states
[not found] ` <Pine.LNX.4.50.0503240904570.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 17:46 ` David Brownell
2005-03-24 17:51 ` Patrick Mochel
@ 2005-03-24 19:27 ` Alan Stern
2 siblings, 0 replies; 72+ messages in thread
From: Alan Stern @ 2005-03-24 19:27 UTC (permalink / raw)
To: Patrick Mochel; +Cc: Linux-pm mailing list
[-- Attachment #1: Type: TEXT/PLAIN, Size: 542 bytes --]
On Thu, 24 Mar 2005, Patrick Mochel wrote:
> I apologize, I assumed you had seen them. You can find them here:
>
> http://article.gmane.org/gmane.linux.kernel/289092
>
> specifically the klist patch:
>
> http://article.gmane.org/gmane.linux.kernel/289090
Thanks. I don't subscribe to LKML -- tried it once and gave up after a
day or two. "Drinking from a fire hose" is a good comparison.
I'll post my comments on LKML (with CC: to you, David, and anyone else who
expresses an interest) after I've absorbed the patches.
Alan Stern
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: CPU local things [was Re: Nested suspends; messages vs. states]
2005-03-24 17:14 ` Nathan Lynch
@ 2005-03-24 20:59 ` Nigel Cunningham
0 siblings, 0 replies; 72+ messages in thread
From: Nigel Cunningham @ 2005-03-24 20:59 UTC (permalink / raw)
To: Nathan Lynch; +Cc: Linux-pm mailing list, Pavel Machek
[-- Attachment #1: Type: text/plain, Size: 2717 bytes --]
Hi Nathan.
On Fri, 2005-03-25 at 04:14, Nathan Lynch wrote:
> > > > Become more componentized. We need a better way to represent optional
> > > > features of any device, but in particular CPUs. We should never be calling
> > > > it directly; we should be looping over the feature drivers bound to a 'CPU
> > > > Driver' and suspending them.
> > >
> > > Actually, no.
> > >
> > > mtrr's are cpu local. That means they need to be handled by CPU
> > > hotplug framework. I guess we should just drop them from "normal"
> > > device trees, and create something per-CPU.
> >
> > Sorry, it was late and that explanation sucked.
> >
> > - Every CPU has a set of optional features that it supports.
> >
> > - MTRRs are an optional feature that a CPU may support.
> >
> > - When the MTRR driver is loaded, a data structure should be allocated for
> > each CPU and added to a list.
> >
> > - The list that the per-CPU MTRR data structure is added to could be part
> > of a 'CPU driver'.
> >
> > - We should be looping over the set of optional features that a CPU
> > supports to suspend/resume them, rather than calling them directly.
>
> Don't sysdev_suspend and sysdev_resume do this already?
I guess you missed part of the context of this discussion. MTRR sysdev
suspend and resume work fine for one CPU, but there's a potential SMP
deadlock. For Suspend2, MTRR sysdev support was removed a number of
months ago, and tied in to CPU state saving and restoring. This
addressed the deadlock, correctly, I believe.
Hope this helps.
Nigel
> > > Perhaps plain old notification list is enough for this one.
> >
> > It's possible, but notification lists present some problems. Like the fact
> > they use a hard-coded set of events in a global header file. They are good
> > only for a certain set of events.
> >
> > It's damn simple to create a struct type for CPU features and a method
> > contained in each one for cpu offline/online. I would suggest adding a
> > list_head to struct cpu (include/linux/cpu.h) called 'features', then
> > having things like MTRR add themselves to that list.
>
> Why is the existing sysdev auxiliary driver support not sufficient?
> Last time I checked, cpufreq uses it for this kind of purpose.
>
>
> Nathan
>
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028; Mob: +61 (417) 100 574
Maintainer of Suspend2 Kernel Patches http://suspend2.net
^ permalink raw reply [flat|nested] 72+ messages in thread
end of thread, other threads:[~2005-03-24 20:59 UTC | newest]
Thread overview: 72+ messages
2005-03-21 20:11 Nested suspends; messages vs. states Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503211436020.1241-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-21 20:20 ` Pavel Machek
[not found] ` <20050321202016.GI1390-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-21 21:14 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503211613010.2329-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-21 22:26 ` Pavel Machek
[not found] ` <20050321222609.GK1390-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-22 3:08 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503212140450.28689-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
2005-03-22 11:08 ` Pavel Machek
[not found] ` <20050322110802.GA1751-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-22 17:24 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503221216430.954-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-23 23:49 ` Benjamin Herrenschmidt
2005-03-23 18:32 ` David Brownell
[not found] ` <200503231032.36164.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 21:00 ` Pavel Machek
2005-03-22 4:21 ` Benjamin Herrenschmidt
2005-03-22 17:04 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503221143460.954-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-22 23:36 ` Benjamin Herrenschmidt
2005-03-23 1:17 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503221709080.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-23 19:02 ` David Brownell
[not found] ` <200503231102.27137.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 20:36 ` Nigel Cunningham
2005-03-23 21:08 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503231544550.631-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-24 2:35 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231827310.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 17:03 ` Alan Stern
[not found] ` <Pine.LNX.4.44L0.0503241149000.1345-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-24 17:13 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240904570.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 17:46 ` David Brownell
2005-03-24 17:51 ` Patrick Mochel
2005-03-24 19:27 ` Alan Stern
2005-03-23 18:58 ` David Brownell
[not found] ` <200503231058.54311.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 19:37 ` Jordan Crouse
[not found] ` <20050323123725.201d8a67-aftB2sG12IhaqnLngUycEA@public.gmane.org>
2005-03-24 5:16 ` David Brownell
2005-03-23 23:24 ` Benjamin Herrenschmidt
2005-03-24 2:45 ` David Brownell
[not found] ` <200503231845.55392.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-24 5:03 ` Benjamin Herrenschmidt
2005-03-24 5:27 ` David Brownell
[not found] ` <200503232127.19576.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-24 6:02 ` Benjamin Herrenschmidt
2005-03-24 6:31 ` David Brownell
[not found] ` <200503232231.00561.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-24 6:36 ` Benjamin Herrenschmidt
2005-03-24 7:46 ` David Brownell
2005-03-23 0:52 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503221635130.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-23 1:21 ` Benjamin Herrenschmidt
2005-03-23 1:46 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503221724550.16154-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-23 3:31 ` Benjamin Herrenschmidt
2005-03-23 18:20 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231008340.17099-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-23 21:02 ` Pavel Machek
[not found] ` <20050323210204.GE30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-23 21:35 ` Nigel Cunningham
[not found] ` <1111613750.14853.117.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
2005-03-23 21:54 ` Pavel Machek
[not found] ` <20050323215416.GK30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-24 2:40 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231838570.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 3:16 ` Nigel Cunningham
[not found] ` <1111634182.3430.1.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
2005-03-24 8:19 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240017460.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 10:01 ` CPU local things [was Re: Nested suspends; messages vs. states] Pavel Machek
[not found] ` <20050324100153.GE1354-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-24 15:59 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240749030.24692-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 17:14 ` Nathan Lynch
2005-03-24 20:59 ` Nigel Cunningham
2005-03-23 23:14 ` Nested suspends; messages vs. states Benjamin Herrenschmidt
2005-03-24 1:27 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231724100.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 9:59 ` Pavel Machek
[not found] ` <20050324095910.GD1354-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-24 15:48 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240746290.24692-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 16:38 ` David Brownell
[not found] ` <200503240838.37628.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-24 17:00 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240858200.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 17:33 ` David Brownell
[not found] ` <200503240933.49123.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-24 17:41 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503240937150.13683-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 18:08 ` David Brownell
2005-03-24 1:41 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231727220.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 2:22 ` Benjamin Herrenschmidt
2005-03-24 2:05 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231742090.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 2:29 ` Benjamin Herrenschmidt
2005-03-24 5:02 ` David Brownell
[not found] ` <200503232102.51132.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-24 5:14 ` Benjamin Herrenschmidt
2005-03-24 5:31 ` David Brownell
2005-03-24 8:16 ` Patrick Mochel
2005-03-23 19:06 ` David Brownell
[not found] ` <200503231106.03160.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 20:29 ` Nigel Cunningham
[not found] ` <1111609769.14853.104.camel-r49W/1Cwd2ff0s6lnCXPX/uOuaPYTxhvJwvTLr3MMZM@public.gmane.org>
2005-03-23 20:55 ` David Brownell
2005-03-23 21:18 ` Alan Stern
2005-03-24 2:13 ` Patrick Mochel
[not found] ` <Pine.LNX.4.50.0503231810400.15119-100000-x8k/2hhmB0w5etPau2IXcQ@public.gmane.org>
2005-03-24 2:52 ` David Brownell