public inbox for linux-pm@vger.kernel.org
* comments on irc log
@ 2005-03-18  2:32 Benjamin Herrenschmidt
  2005-03-18 16:56 ` Alan Stern
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-18  2:32 UTC
  To: Linux-pm mailing list

Hi Folks !

Sorry, I couldn't make it for a 2am meeting, and I suspect I had way too
much Guinness in my system to be useful at midnight anyway yesterday :)

I've browsed the IRC log and have a few notes/comments/replies:

21:38:01< pavelm> At one point someone at intel was looking onto s-t-ram on smp machine...
21:38:13< pavelm> ...is he/she still working on that?
21:38:51< pavelm> Airplane-like machine was toshiba laptop; I did nnot open it.
21:39:24< pavelm> DC=dual core... aha, I parsed it wrong.

Pavel: Paulus has that working on an SMP PowerMac. The simplest/safest way to do that
is to implement some kind of hotplug CPU (even if the CPU isn't physically turned off,
just "park" it in some kind of sleep loop or so), and only trigger the system-wide STR
after you have stopped all CPUs but one. He left userland the responsibility to do that.

Of course, it also depends how the wakeup works on SMP systems, I suspect it's fairly
platform specific. On those macs, all CPUs come up in ROM, and the ROM keeps all but
one in a sleep loop, like on boot, and we get them back with a soft-reset, like on boot.
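
As a rough sketch of the ordering described above, in C: park the secondary CPUs
first, and only allow the system-wide STR once a single CPU is left running. The
names cpu_slot, park_secondary_cpus() and str_allowed() are made up for
illustration, and CPUs are modelled as plain flags rather than real hotplug code.

```c
#include <stddef.h>

/* Hypothetical model: one flag per CPU, index 0 is the CPU that stays up. */
struct cpu_slot {
	int running;	/* 1 = executing kernel code, 0 = parked in a sleep loop */
};

/* Park every CPU except the first one; returns how many were parked. */
int park_secondary_cpus(struct cpu_slot *cpus, size_t n)
{
	size_t i;
	int parked = 0;

	for (i = 1; i < n; i++) {
		if (cpus[i].running) {
			cpus[i].running = 0;	/* real code would send it to a sleep loop */
			parked++;
		}
	}
	return parked;
}

/* System-wide STR may only start once a single CPU is left running. */
int str_allowed(const struct cpu_slot *cpus, size_t n)
{
	size_t i;
	int running = 0;

	for (i = 0; i < n; i++)
		running += cpus[i].running ? 1 : 0;
	return running == 1;
}
```

Whether the parking step is driven from userland or from the kernel is a
separate policy question; the check itself stays the same either way.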

21:43:34< nigel> Luming: I was meaning one where the chip itself gets completely powered down and needs a complete reconfigure on wake.
21:43:39< pavelm> ;-< well, when BIOS at least posts the card, things are easy.

Note that I have some code for POSTing some Radeons that might be adaptable. The only
"issue" is I don't know how to extract from the x86 BIOS ROM the proper sequence of values
for the SDRAM mode register (SDRAM chip init). This is write-only obviously so I can't just
read the values before sleep and POST the chip with those like I do for the rest of the
chip. I know values for Mac laptops, not x86.

21:52:11< pavelm> I was playing with variable scheduling ticks here, hoping to save some power.
21:52:31< pavelm> How big power savings should I expect?
21:52:48< pavelm> What cpu will benefit most?
21:53:04< pavelm> Is there easy way to measure it?

I played with that too on some PPCs and was surprised by the absence of benefit, but I might
have done something wrong, I need to instrument the stuff better.



23:26:39< nigel> I wonder if we should just have an enter_state() call.
23:26:59< mochel> nigel: that's been suggested before 
23:27:00< db> nigel:  enter_state(state) would then be suspend(state)???
23:27:06< mochel> yes

I have this crazy idea that we could have a single "new" enter_state(), and keep
suspend/resume for system state transitions.

Basically, my idea there is that enter_state() is the actual low level driver
state change function. It is called when userland picks a state in sysfs, or
we could deal with the various bus state dependencies if we want etc...

We could keep suspend/resume separate for the system-wide suspend, and have
them implement the policy of converting a system wide suspend/resume into the
appropriate enter_state() for the driver.

"Old" or "Simple" drivers would just suspend/resume and not implement
enter_state, more complex/subtle drivers would do the above.

I haven't quite thought out the implications, it's just an idea that came to mind
as I was reading the log...
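
To make the split a bit more concrete, here is a minimal sketch in C. Every name
in it (sketch_driver, sketch_device, the sys_state values, the state numbers) is
hypothetical and not existing kernel API; it only shows suspend/resume carrying
the system-wide policy while enter_state() does the actual device state change.

```c
#include <stddef.h>

/* Hypothetical system states, for illustration only. */
enum sys_state { SYS_ON, SYS_STANDBY, SYS_SUSPEND_RAM, SYS_SUSPEND_DISK };

struct sketch_device;

struct sketch_driver {
	/* System-wide transitions: every driver provides these. */
	int (*suspend)(struct sketch_device *dev, enum sys_state target);
	int (*resume)(struct sketch_device *dev);
	/* Low-level device state change; NULL for "old" or "simple" drivers. */
	int (*enter_state)(struct sketch_device *dev, int dev_state);
};

struct sketch_device {
	const struct sketch_driver *drv;
	int dev_state;	/* driver-private numbering, 0 = fully on */
};

/* Path taken when userland picks a device state in sysfs. */
int sketch_set_device_state(struct sketch_device *dev, int dev_state)
{
	if (!dev->drv->enter_state)
		return -1;	/* simple driver: no runtime states to offer */
	return dev->drv->enter_state(dev, dev_state);
}

/* Path taken for a system-wide suspend. */
int sketch_system_suspend(struct sketch_device *dev, enum sys_state target)
{
	return dev->drv->suspend(dev, target);
}

/* A "complex" driver: suspend() is only policy on top of enter_state(). */
static int rich_enter_state(struct sketch_device *dev, int dev_state)
{
	if (dev_state < 0 || dev_state > 3)
		return -1;
	dev->dev_state = dev_state;	/* real code would program the hardware */
	return 0;
}

static int rich_suspend(struct sketch_device *dev, enum sys_state target)
{
	/* Driver-specific choice of device state for the system state. */
	return rich_enter_state(dev, target == SYS_STANDBY ? 1 : 3);
}

static int rich_resume(struct sketch_device *dev)
{
	return rich_enter_state(dev, 0);
}

const struct sketch_driver rich_driver = {
	.suspend = rich_suspend,
	.resume = rich_resume,
	.enter_state = rich_enter_state,
};

/* A "simple" driver: just suspend/resume, no enter_state at all. */
static int simple_suspend(struct sketch_device *dev, enum sys_state target)
{
	(void)target;
	dev->dev_state = 1;
	return 0;
}

static int simple_resume(struct sketch_device *dev)
{
	dev->dev_state = 0;
	return 0;
}

const struct sketch_driver simple_driver = {
	.suspend = simple_suspend,
	.resume = simple_resume,
};
```

The point of the layout is that the simple driver needs no changes at all, while
the complex one keeps a single place where the hardware state is touched.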
 
23:27:23< alan> nigel: The PM core still needs to tell suspends and resumes apart.
23:27:38< nigel> You mean system states, or run time?
23:27:41 * lenb returns
23:27:42< alan> Whether they use separate callbacks is an implementation detail.
23:27:47< nigel> I'm thinking of both.
23:27:56< mochel> the core only needs system states 
23:28:05< jcrouse> right
23:28:22< db> mochel: if core only needs S0/S1/S2... [ sticking to ACPI model for the moment]
23:28:25< mochel> it then should tell the drivers that they need to enter a state compatible w/ that system state
23:28:27< nigel> True. But the state a driver is in is affected by both.

I think system states, driver/device states and bus states are 3 different things.

I would say a scenario is:

 - Driver picks a device state based on a system state
 - Bus state updates based on child device states
 - Bus state might be force-able, in which case it triggers child device state changes

Bus states could be bus-type specific. Drivers could represent an array of states with names
as I proposed, with a dependency to bus states explained as bit masks (optionally maybe a
function to resolve dependencies for drivers that have special constraints ?)
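
A small sketch in C of what "an array of states with names, with a dependency to
bus states as bit masks" could look like. The BUS_B* bits, struct dev_state,
struct child and bus_deepest_state() are all invented for the example; the bus
simply picks the deepest state that every child's current state tolerates.

```c
#include <stddef.h>

/* Hypothetical bus states for some bus type, one bit each. */
#define BUS_B0 (1u << 0)	/* fully on */
#define BUS_B1 (1u << 1)
#define BUS_B2 (1u << 2)
#define BUS_B3 (1u << 3)	/* off */

struct dev_state {
	const char *name;	/* what sysfs would show */
	unsigned int bus_mask;	/* bus states this device state can live with */
};

struct child {
	const struct dev_state *states;	/* array provided by the driver */
	size_t nstates;
	size_t cur;			/* index of the current state */
};

/* Deepest bus state all children tolerate: bit index 0..3, or -1 if none. */
int bus_deepest_state(const struct child *kids, size_t n)
{
	unsigned int allowed = BUS_B0 | BUS_B1 | BUS_B2 | BUS_B3;
	size_t i;
	int bit;

	for (i = 0; i < n; i++)
		allowed &= kids[i].states[kids[i].cur].bus_mask;
	for (bit = 3; bit >= 0; bit--)
		if (allowed & (1u << bit))
			return bit;
	return -1;
}

/* Example state table a driver might export. */
const struct dev_state example_states[] = {
	{ "on",        BUS_B0 },
	{ "low-power", BUS_B0 | BUS_B1 },
	{ "off",       BUS_B0 | BUS_B1 | BUS_B2 | BUS_B3 },
};
```

A driver with special constraints would replace the plain mask with a resolver
function, but the mask covers the common case.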

Policy of what device state to choose for a system state is driver specific, and could be done the
way I described above with my idea of separating suspend/resume from enter_state.

Again, just crazy ideas coming to mind as I read. I'm still recovering from St Patrick's
night :)

23:28:38< alan> The core needs to know about system states, but sysfs needs to understand runtime states a little.
23:28:54< mochel> sysfs is not really an issue
23:28:56< db> mochel: then we need separate layers for driver-specific states D0/D1/D2/... yes?
23:29:03< mochel> db: yes
23:29:17< mochel> we can and should export each bus-specific range of states through sysfs
23:29:25< mochel> (through the devices directories) 
23:29:38< db> So a given system will have (a) system states, (b) driver states, often bus-specific ...

Heh, funny, close to what I wrote :) Yes, I think we need to separate those, and I would
even split bus states & driver states. Drivers can have plenty of local states that aren't
bus specific (they can have a rich set of local PM states I mean) and the bus would
eventually only create a dependency to some of those states.

BTW. David, can't your clock stuff be simply represented in terms of bus & device states as
well ? In most cases, it's not PCI, so it could be defined as special bus types with states
matching the various clock states.
 
23:29:40< mochel> i.e. each bus type should export states for that device
23:29:53< db> ... all using the same pointer type
23:29:58< mochel> then we need a userspace utility that distinguishes between them
23:30:15< nigel> Don't you then end up with some ugly mess inside drivers where you figure out what to do for different combinations of  runtime and system states?
23:30:28< mochel> nigel: yes, but better there than in the core
23:30:34< mochel> we can provide helpers in the buses
23:30:44< mochel> because most devices will be the same 
23:31:01 * mochel will be back in 2 minutes 
23:31:10< jcrouse> If the bus handled the system state -> device state translation, then it makes the drivers much easier

I think the driver should choose, but we could provide "defaults" for drivers who don't
want to bother.

23:31:11< db> so that makes a third way to use states:  transform them
23:31:18< nigel> Hmmm... I suppose you can't avoid that... ok.
23:31:20< bernard> Also, should transitions be restricted to only to/from the "on" state (whatever that may be, eg D0)? Or should there be some wrapper code for suspend/enter_state that first puts the device back into D0, then suspend?

No. System state transitions are restricted that way; device states are more flexible. Devices and busses can have specific restrictions
(like PCI spec mandates a transition to D0 iirc when coming from a deeper state) but that is to be
handled locally, either at the bus or device level.

23:31:42< db> not all busses are as regular as PCI or USB, note ... platform devices on embedded hardware, e.g.
23:32:15< db> bernard:  gaack.  please, no arbitrary restrictions.  busses may have some though.
23:32:21< nigel> I'm not sure about enforcing going to full power first.
23:32:34< alan> Wrapper code is up to the driver.

Agreed.

23:33:02< db> nigel: I'm sure about NOT enforcing it.  E.g. for PCI, unless driver needs it, D1 to D2 is fine.
23:33:09< nigel> :>
23:33:31< nigel> I was expressing my reservations gently :>
23:33:32< bernard> db: okay. Pavel's pm_message_t insists that transitions are only ever made into or out of D0. (I guess because it makes life easier for existing drivers' PM code).
23:33:59< bernard> rather, where pm_message_t was headed
23:34:13< alan> Doesn't pm_message_t allow FREEZE -> SUSPEND?

No. Once frozen, you can't talk to your devices, your parent (bus) may be frozen too, so you
can't send the necessary commands to your device to suspend it. FREEZE and SUSPEND in the
current simple model share the fact that once you are frozen, no activity can take place
since your parent (bus) can be frozen too preventing communication with the device. 
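
Put as code, the rule of the current simple model is that exactly one end of a
transition has to be the "on" state. The enum and function names below are
hypothetical, chosen only to mirror the ON/FREEZE/SUSPEND vocabulary used here.

```c
#include <stdbool.h>

/* Hypothetical names for the three states of the simple model. */
enum pm_simple_state { PM_ON, PM_FREEZE, PM_SUSPEND };

/*
 * A transition is valid only when exactly one side is PM_ON:
 * ON -> FREEZE, ON -> SUSPEND, FREEZE -> ON and SUSPEND -> ON are fine,
 * FREEZE -> SUSPEND and SUSPEND -> FREEZE are not. Staying in the same
 * state is not counted as a transition.
 */
bool pm_transition_ok(enum pm_simple_state from, enum pm_simple_state to)
{
	return (from == PM_ON) != (to == PM_ON);
}
```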

23:34:52< nigel> We wouldn't use it if it did.
23:34:56< db> bernard: I think that "revert-to-D0" style rule came from Patrick's original driver model stuff; ask him

It is a PCI requirement iirc, no ?

23:35:01< nigel> Freeze is only used during the atomic copy.

Well, I'm toying with the idea of extending freeze to kexec 

23:35:10< bernard> quoting Documentation/power/devices.txt (in Pavel's tree) "Transitions are only from a resumed state to a suspended state, never between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, FREEZE -> SUSPEND or SUSPEND -> FREEZE can not)."
23:35:13< nigel> After that we want devices on for writing the atomic copy.
23:35:24-!- pavelma [~pavel-jyMamyUUXNJG4ohzP4jBZS1Fcj925eT/@public.gmane.org] has joined #pm
23:35:33< alan> This is a separate question not mentioned in my email.
23:35:45< alan> In principle the image can be written without waking up every device.

Yes, partial tree suspend. It was decided that we would bother about it when we have the stuff
working well enough as it is though :) If we start going to device local states, sysfs originated
transitions, etc, though, we'll probably end up with a mechanism capable of that. That is
triggering a wake of the storage device which will "cascade" upward along the tree.
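
A minimal sketch of that cascade in C, assuming nothing more than a parent
pointer and a suspended flag per device (struct node and wake_path() are made-up
names): waking the storage device wakes its ancestors first and leaves the rest
of the tree asleep.

```c
#include <stddef.h>

struct node {
	struct node *parent;	/* NULL at the root of the device tree */
	int suspended;
};

/*
 * Wake 'dev' and every suspended ancestor, parents first, since a child
 * cannot be reached through a sleeping bus. Returns how many devices
 * were actually woken.
 */
int wake_path(struct node *dev)
{
	int woken;

	if (!dev)
		return 0;
	woken = wake_path(dev->parent);
	if (dev->suspended) {
		dev->suspended = 0;	/* the driver's resume would run here */
		woken++;
	}
	return woken;
}
```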

23:35:47< pavelma> Sorry, poor signal inside, rain outside.
23:35:59< alan> Hi Pavel!
23:36:03 * db greets pavel ... no horses today?
23:36:28-!- bernard changed the topic of #pm to: linux-pm discussion. Logged live at http://helicon.ucs.uwa.edu.au/~bernard/irc/%23pm.log
23:36:36< pavelma> horses will have to wait for my battery running out.
23:37:13< db> We don't seem to be progressing very far on states.
23:37:19< alan> Time for next topic?...
23:37:29< mochel> yes, 
23:37:31< db> We have a working agreement (I think) that they're pointer-to-struct,
23:37:39< alan> Can complex states be given names simple enough to use with sysfs?
23:37:43< mochel> but first can we have some administritiva? 
23:37:56< alan> mochel: go ahead.
23:37:58< db> maybe with name embedded, and are used in suspend() calls, sysfs, and bus mapping glue.
23:38:01< pavelma> alan: read faq in swsusp.txt for why partial resume in swsusp is bad idea.

Note that entering FREEZE and exiting it is very fast. Currently we suspend everything, though;
once drivers start knowing the difference, it really will be, as there is no HW PM to be done.


23:45:26< mochel> alan: is it right to assume that we agree that system power states are generic,  but need to be translated to bus-specific states?
23:45:35< alan> mochel: Exactly.

Agreed, though I would go richer and define device states. In fact, device states and bus states could
be the same thing if you consider the bus state as being the device state of the bus controller
device.

But my point is that individual drivers want to expose richer states than the normal
bus states, they may have locally several PM modes with various performances for
example that they want to expose in sysfs. I think we should cover device states
and maybe just have bus states just be device states of the bus controllers, and
deal with cascading dependencies.

Heh, strange, it sounds like what I wrote a while ago :)

So well defined busses like PCI or USB would have a strict definition of the
bus states, that is the pci_bus driver states (yes, we are getting pci bus drivers,
we need those anyway and I've seen separate work toward this) and the {e,o,u}hci
states.

23:45:51< pavelma> alan: I lost about half of conversation...
23:45:57< mochel> and the best way to do that is to encapsulate them in bus-specific type 
23:45:59< mochel> ?

Hrm...

23:46:03< db> Or driver-specific ones.  Repeat:  not everything's as regular as pci or usb 

Agreed.

23:46:04< nigel> What about those funny states that one platform had? I forgot the names now.
23:46:25< mochel> jcrouse: heh, they're in everyone's mind. we've a lot to do in 4 months :) 
23:46:34< jcrouse> no doubt
23:46:38< alan> pavelma: You can catch up later on Bernard's feed.
23:46:40< lenb> alan: on ACPI-enabled systems, for motherboard devices, the BIOS provides a mapping between system and device states -- though Linux doesn't look at it yet.

That's fine. If we split and decide that, for example, suspend/resume are responsible
for this mapping, drivers are welcome to just call ACPI to get it. Or we could have
the core do the mapping if the driver doesn't provide a mapping function, and the core
could call ACPI on machines where it exists. But I want the driver to have the possibility
of being in control, to decide either not to use ACPI or override its decision.
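
One possible shape for that lookup order, sketched in C: the driver is asked
first, then a platform hook (which on an ACPI machine would query the firmware),
then a core default. All names here (sys_target, map_fn, platform_map,
choose_device_state() and the example hooks) are assumptions for illustration,
and the state numbers are arbitrary.

```c
#include <stddef.h>

/* Hypothetical system-wide targets. */
enum sys_target { TARGET_ON, TARGET_STANDBY, TARGET_SUSPEND };

/* A mapping returns a device state number, or -1 for "no opinion". */
typedef int (*map_fn)(enum sys_target sys);

struct mapped_device {
	map_fn driver_map;	/* optional: lets the driver stay in control */
};

/* Optional platform hook; NULL when the firmware offers no mapping. */
map_fn platform_map;

/* Core fallback: fully on, or the deepest state for anything else. */
int core_default_map(enum sys_target sys)
{
	return sys == TARGET_ON ? 0 : 3;
}

int choose_device_state(const struct mapped_device *dev, enum sys_target sys)
{
	int st;

	if (dev->driver_map) {
		st = dev->driver_map(sys);
		if (st >= 0)
			return st;	/* driver overrides everything else */
	}
	if (platform_map) {
		st = platform_map(sys);
		if (st >= 0)
			return st;	/* firmware-provided mapping */
	}
	return core_default_map(sys);
}

/* Example hooks, standing in for a firmware table and a picky driver. */
int example_firmware_map(enum sys_target sys)
{
	return sys == TARGET_STANDBY ? 2 : -1;
}

int example_driver_map(enum sys_target sys)
{
	return sys == TARGET_SUSPEND ? 1 : -1;
}
```

A driver that distrusts the firmware just returns its own answer for every
target, and the platform hook is never consulted for that device.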

23:46:40< db> nigel:  most non-pc platforms don't support pc platform states...
23:46:53< nigel> Systems states I mean.
23:47:10< mochel> So, we pass system state to drivers, which then must translate it to an appropriate device state for the given system state.

Agreed.

23:47:14< db> s/platform state/syste state/
23:47:18< alan> Bus and device drivers should strive to use ACPI mappings when available.

We should only define bus states (or driver states for bus controllers, see above).

Whether we define them the same way ACPI does is a matter of how good the ACPI definition
is. I haven't seen it, but it should probably be discussed bus per bus.

23:47:29< mochel> s/available/appropriate/
23:47:41< mochel> ACPI is not *always* right :) 
23:47:48-!- pavelm [~pavel-jyMamyUUXNJG4ohzP4jBZS1Fcj925eT/@public.gmane.org] has quit [Remote host closed the connection]
23:47:53< alan> Available _and_ appropriate.
23:48:08< mochel> i figured the latter assumed the former..
23:48:22< db> acpi states were supposed to be considerd in pci_choose_state() for example
23:48:30< lenb> Yes, I'm okay with Linux being able to over-ride what the BIOS tells it, but no reason to invent a new language and mapping if ACPI gives us one already on many systems.
23:48:45< mochel> lenb: definitely

Provided the mapping given by ACPI is sane ;) but again, I haven't seen it. I don't feel like grokking
the whole ACPI spec, so it would be nice if you could do a short abstract of it for us...
  
23:48:51< alan> Is there an easy way to get the mapping from ACPI?
23:48:55< mochel> ditto for other firmwares
23:49:00< db> On systems that support ACPI, how will we know to ignore its mappings?

Drivers have an override function, and the platform may override too.


23:52:21< db> mochel:  I'm thinking about embedded hardware, like ARM, that will never touch ACPI.   Ever.

Or pmac, I hope :)

Ok, enough for now...

Ben.




* Re: comments on irc log
  2005-03-18  2:32 comments on irc log Benjamin Herrenschmidt
@ 2005-03-18 16:56 ` Alan Stern
       [not found]   ` <Pine.LNX.4.44L0.0503181147110.1099-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  2005-03-18 18:13 ` Pavel Machek
  2005-03-23 19:46 ` David Brownell
  2 siblings, 1 reply; 24+ messages in thread
From: Alan Stern @ 2005-03-18 16:56 UTC
  To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list

On Fri, 18 Mar 2005, Benjamin Herrenschmidt wrote:

> I have this crazy idea that we could have a single "new" enter_state(), and keep
> suspend/resume for system state transitions.
> 
> Basically, my idea there is that enter_state() is the actual low level driver
> state change function. It is called when userland picks a state in sysfs, or
> we could deal with the various bus state dependencies if we want etc...
> 
> We could keep suspend/resume separate for the system-wide suspend, and have
> them implement the policy of converting a system wide suspend/resume into the
> appropriate enter_state() for the driver.
> 
> "Old" or "Simple" drivers would just suspend/resume and not implement
> enter_state, more complex/subtle drivers would do the above.
> 
> I haven't quite thought out the implications, it's just an idea that came to mind
> as I was reading the log...

This sounds like a reasonable thing to do.  We do need distinct ways to
tell drivers "Go to this system state" and "Go to this bus/device state".  
Whether they are implemented by 1, 2, or 3 different callbacks doesn't
really matter (except that we might want to minimize the number of 
function pointers stored in the driver structures).

> Yes, partial tree suspend. It was decided that we would bother about it when we have the stuff
> working well enough as it is though :) If we start going to device local states, sysfs originated
> transitions, etc, though, we'll probably end up with a mecanism capable of that. That is
> triggering a wake of the storage device which will "cascade" upward along the tree.

Absolutely; something like that is needed for runtime resume-on-demand.

> Note that entering FREEZE and exiting it is very fast. Currently, we suspend everything tho, but
> once drivers start knowing the difference, it will be as there is no HW PM to be done.

Usually very fast.  Unfortunately for some kinds of USB host controllers 
it can be relatively slow, since quiescing the controller has an 
unavoidable side effect of suspending all devices on the bus.

Alan Stern



* Re: comments on irc log
  2005-03-18  2:32 comments on irc log Benjamin Herrenschmidt
  2005-03-18 16:56 ` Alan Stern
@ 2005-03-18 18:13 ` Pavel Machek
       [not found]   ` <20050318181317.GD18427-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  2005-03-23 19:46 ` David Brownell
  2 siblings, 1 reply; 24+ messages in thread
From: Pavel Machek @ 2005-03-18 18:13 UTC
  To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list

Hi!

> I've browsed the IRC log and have a few notes/comments/replies:
> 
> 21:38:01< pavelm> At one point someone at intel was looking onto s-t-ram on smp machine...
> 21:38:13< pavelm> ...is he/she still working on that?
> 21:38:51< pavelm> Airplane-like machine was toshiba laptop; I did nnot open it.
> 21:39:24< pavelm> DC=dual core... aha, I parsed it wrong.
> 
> Pavel: Paulus has that working on an SMP PowerMac. The simplest/safest way to do that
> is to implement some kind of hotplug CPU (even if the CPU isn't physically turned off,
> just "park" it in some kind of sleep loop or so), and only trigger the system-wide STR
> after you have stopped all CPUs but one. He left usrland the
> responsibility to do that.

I do not want to leave that to userland, because we need it during
swsusp resume, too, and userland is not available at that
point. Otherwise I agree (and have some code from Li that implements
it that way).

> 21:43:34< nigel> Luming: I was meaning one where the chip itself gets completely powered down and needs a complete reconfigure on wake.
> 21:43:39< pavelm> ;-< well, when BIOS at least posts the card, things are easy.
> 
> Note that I have some code for POST'ing some radeon's that might be adapt-able. The only
> "issue" is I don't know how to extract from the x86 BIOS ROM the proper sequence of values
> for the SDRAM mode register (SDRAM chip init). This is write-only obviously so I can't just
> read the values before sleep and POST the chip with those like I do for the rest of the
> chip. I know values for Mac laptops, not x86.

So it is basically "need a very simple piece of documentation from the
notebook vendor" and some vendors are already willing to share that
info? Good.

> 21:52:11< pavelm> I was playing with variable scheduling ticks here, hoping to save some power.
> 21:52:31< pavelm> How big power savings should I expect?
> 21:52:48< pavelm> What cpu will benefit most?
> 21:53:04< pavelm> Is there easy way to measure it?
> 
> I played with that too on some PPCs and was surprised by the absence of benefit, but I might
> have done something wrong, I need to instrument the stuff better.

The difference between HZ=100 and HZ=1000 was measured by seife to be
approx. as big as disk spun up vs. spun down (i.e. a watt or so)...
Not *that* unimportant. But it took a month or so to get that
data, because he was basically measuring runtime from full to empty
battery.

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: comments on irc log
       [not found]   ` <Pine.LNX.4.44L0.0503181147110.1099-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
@ 2005-03-18 18:14     ` Pavel Machek
  2005-03-18 23:07     ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 24+ messages in thread
From: Pavel Machek @ 2005-03-18 18:14 UTC
  To: Alan Stern; +Cc: Linux-pm mailing list

Hi!

> > I have this crazy idea that we could have a single "new" enter_state(), and keep
> > suspend/resume for system state transitions.
> > 
> > Basically, my idea there is that enter_state() is the actual low level driver
> > state change function. It is called when userland picks a state in sysfs, or
> > we could deal with the various bus state dependencies if we want etc...
> > 
> > We could keep suspend/resume separate for the system-wide suspend, and have
> > them implement the policy of converting a system wide suspend/resume into the
> > appropriate enter_state() for the driver.
> > 
> > "Old" or "Simple" drivers would just suspend/resume and not implement
> > enter_state, more complex/subtle drivers would do the above.
> > 
> > I haven't quite thought out the implications, it's just an idea that came to mind
> > as I was reading the log...
> 
> This sounds like a reasonable thing to do.  We do need distinct ways to
> tell drivers "Go to this system state" and "Go to this bus/device state".  
> Whether they are implemented by 1, 2, or 3 different callbacks doesn't
> really matter (except that we might want to minimize the number of 
> function pointers stored in the driver structures).

Actually you want to re-use existing suspend-to-ram code in drivers as
much as possible.

								Pavel

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: comments on irc log
       [not found]   ` <Pine.LNX.4.44L0.0503181147110.1099-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
  2005-03-18 18:14     ` Pavel Machek
@ 2005-03-18 23:07     ` Benjamin Herrenschmidt
  2005-03-18 23:18       ` Pavel Machek
  1 sibling, 1 reply; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-18 23:07 UTC
  To: Alan Stern; +Cc: Linux-pm mailing list


> Usually very fast.  Unfortunately for some kinds of USB host controllers 
> it can be relatively slow, since quiescing the controller has an 
> unavoidable side effect of suspending all devices on the bus.

I wonder if we need to quiesce the controller in fact for FREEZE.
Probably not. Just stop all queue processing and refuse URBs. The hcca
will still get updated, but who cares ? it will end up being saved in
an inconsistent state in the suspend image, so what ? On resume, we will
have rebooted, we can "clean it up".

This is sort-of breaking the rule of "no DMA", and thus is not suitable
for kexec (which is ok, kexec currently uses the separate "shutdown"
callback which must switch DMA off), but would fix the problem for
suspend to disk...

Ben.




* Re: comments on irc log
       [not found]   ` <20050318181317.GD18427-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-18 23:15     ` Benjamin Herrenschmidt
  2005-03-21 20:06     ` Jordan Crouse
  1 sibling, 0 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-18 23:15 UTC
  To: Pavel Machek; +Cc: Linux-pm mailing list

On Fri, 2005-03-18 at 19:13 +0100, Pavel Machek wrote:

> 
> So it is basically "need very simple piece of documentation from
> notebook vendor" and some vendors are already wiling to share that
> info? Good.

Yes. At this point, I can wake up rv350 and rv280 mobilities with some
limitations (maybe not all panels, I don't get the DVI output right
yet, ...). The "piece of doc" might even be obtained from ATI there
since it's possible that this info is in a standard place in the BIOS
image (like other tables already there).

> > 21:52:11< pavelm> I was playing with variable scheduling ticks here, hoping to save some power.
> > 21:52:31< pavelm> How big power savings should I expect?
> > 21:52:48< pavelm> What cpu will benefit most?
> > 21:53:04< pavelm> Is there easy way to measure it?
> > 
> > I played with that too on some PPCs and was surprised by the absence of benefit, but I might
> > have done something wrong, I need to instrument the stuff better.
> 
> Difference between HZ=100 and HZ=1000 was measuerd to be approx. as
> big as disk spinnned up vs. spinned down (i.e. watt or so) by
> seife.... Not *that* unimportant. But it took month or so to get that
> data, because he was basically measuring runtime from full to empty

Yes, it was also on some ppc's which is why I'm wondering if I did
something wrong in my experiment :) It may also be different for
different CPU models.

Ben




* Re: comments on irc log
  2005-03-18 23:07     ` Benjamin Herrenschmidt
@ 2005-03-18 23:18       ` Pavel Machek
       [not found]         ` <20050318231801.GE24449-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Pavel Machek @ 2005-03-18 23:18 UTC
  To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list

Hi!

> > Usually very fast.  Unfortunately for some kinds of USB host controllers 
> > it can be relatively slow, since quiescing the controller has an 
> > unavoidable side effect of suspending all devices on the bus.
> 
> I wonder if we need to quiesce the controller in fact for FREEZE.
> Probably not. Just stop all queue processing and refuse URBs. The hcca
> will still get updated, but who cares ? it will end up beeing saved in
> an inconsistent state in the suspend image, so what ? On resume, we will
> have rebooted, we can "clean it up".
> 
> This is sort-of breaking the rule of "no DMA", and thus is not suitable
> for kexec (which is ok, kexec currently uses the separate "shutdown"
> callback which must switch DMA off), but would fix the problem for
> suspend to disk...

What problem? suspend seems to +/- work with suspend-to-disk just
now. I'd really hate to have to think about "some memory may change
behind my back" during suspend. I think "no DMA" is a good rule.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: comments on irc log
       [not found]         ` <20050318231801.GE24449-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-19  1:21           ` Benjamin Herrenschmidt
  2005-03-19  3:23             ` Alan Stern
  2005-03-19 10:32             ` Pavel Machek
  0 siblings, 2 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-19  1:21 UTC
  To: Pavel Machek; +Cc: Linux-pm mailing list

On Sat, 2005-03-19 at 00:18 +0100, Pavel Machek wrote:
> Hi!
> 
> > > Usually very fast.  Unfortunately for some kinds of USB host controllers 
> > > it can be relatively slow, since quiescing the controller has an 
> > > unavoidable side effect of suspending all devices on the bus.
> > 
> > I wonder if we need to quiesce the controller in fact for FREEZE.
> > Probably not. Just stop all queue processing and refuse URBs. The hcca
> > will still get updated, but who cares ? it will end up beeing saved in
> > an inconsistent state in the suspend image, so what ? On resume, we will
> > have rebooted, we can "clean it up".
> > 
> > This is sort-of breaking the rule of "no DMA", and thus is not suitable
> > for kexec (which is ok, kexec currently uses the separate "shutdown"
> > callback which must switch DMA off), but would fix the problem for
> > suspend to disk...
> 
> What problem? suspend seems to +/- work with suspend-to-disk just
> now. I'd really hate to have to think about "some memory may change
> behind my back" during suspend. I think "no DMA" is a good rule.

Well, it's not that simple. It may work for you and not for others, and
it will definitely introduce complications with the current scheme since
it seems we +/- have to suspend USB busses (and possibly disconnect some
devices) at freeze time...

Note that this will become a non-issue when instead of waking everybody
up, we just wake the devices on the disk path, we can do the real
suspend initially for the others.

For now, it might be interesting to not shut down the OHCI (not sure
about E/UHCI's tho) as just letting it touch the HCCA isn't an issue
(only for the freeze before snapshot tho, not when getting rid of the
loader kernel, but in this case, what we do is more like kexec and we
might prefer using shutdown callbacks to that effect).

Ben.




* Re: comments on irc log
  2005-03-19  1:21           ` Benjamin Herrenschmidt
@ 2005-03-19  3:23             ` Alan Stern
       [not found]               ` <Pine.LNX.4.44L0.0503182205040.30560-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
  2005-03-19 10:32             ` Pavel Machek
  1 sibling, 1 reply; 24+ messages in thread
From: Alan Stern @ 2005-03-19  3:23 UTC
  To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list

On Sat, 19 Mar 2005, Benjamin Herrenschmidt wrote:

> > > I wonder if we need to quiesce the controller in fact for FREEZE.
> > > Probably not. Just stop all queue processing and refuse URBs. The hcca
> > > will still get updated, but who cares ? it will end up beeing saved in
> > > an inconsistent state in the suspend image, so what ? On resume, we will
> > > have rebooted, we can "clean it up".
> > > 
> > > This is sort-of breaking the rule of "no DMA", and thus is not suitable
> > > for kexec (which is ok, kexec currently uses the separate "shutdown"
> > > callback which must switch DMA off), but would fix the problem for
> > > suspend to disk...
> > 
> > What problem? suspend seems to +/- work with suspend-to-disk just
> > now. I'd really hate to have to think about "some memory may change
> > behind my back" during suspend. I think "no DMA" is a good rule.
> 
> Well, it's not that simple. It may work for you and not for others, and
> it will definitely introduce complications with the current scheme since
> it seems we +/- have to suspend USB busses (and possibly disconnect some
> devices) at freeze time...

It may not be as bad as I made it sound...  The time required is probably 
on the order of 20 - 30 ms per bus for freeze/suspend + resume.

> Note that this will become a non-issue when instead of waking everybody
> up, we just wake the devices on the disk path, we can do the real
> suspend initially for the others.
> 
> For now, it might be interesting to not shut down the OHCI (not sure
> about E/UHCI's tho) as just letting it touch the HCCA isn't an issue
> (only for the freeze before snapshot tho, not when getting rid of the
> loader kernel, but in this case, what we do is more like kexec and we
> might prefer using shutdown callbacks to that effect).

If I understand correctly EHCI is no problem at all, since the I/O queues 
can be turned off independently from the controller itself.

But there might be a different problem.  If a USB driver is not modular,
then what happens when the swsusp boot kernel loads the memory image?  It
FREEZEs all the devices first, right?  So that the image kernel gets
control back with the devices in a known good state.  If you allowed a
device to continue doing DMA during this FREEZE (even if it's just OUT
transfers, not writing to memory), it could end up reading arbitrary
invalid data which would cause a device error.

This problem would be eliminated if the boot kernel's driver was aware
that a resume-from-disk was in progress.  I've heard that suspend2 can
offer such a feature; can it be added to swsusp?  Using a special flags 
value for the FREEZE would be sufficient.

Alan Stern



* Re: comments on irc log
  2005-03-19  1:21           ` Benjamin Herrenschmidt
  2005-03-19  3:23             ` Alan Stern
@ 2005-03-19 10:32             ` Pavel Machek
  1 sibling, 0 replies; 24+ messages in thread
From: Pavel Machek @ 2005-03-19 10:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 1043 bytes --]

Hi!

> > > This is sort-of breaking the rule of "no DMA", and thus is not suitable
> > > for kexec (which is ok, kexec currently uses the separate "shutdown"
> > > callback which must switch DMA off), but would fix the problem for
> > > suspend to disk...
> > 
> > What problem? suspend seems to +/- work with suspend-to-disk just
> > now. I'd really hate to have to think about "some memory may change
> > behind my back" during suspend. I think "no DMA" is a good rule.
> 
> Well, it's not that simple. It may work for you and not for others, and
> it will definitely introduce complications with the current scheme since
> it seems we +/- have to suspend USB busses (and possibly disconnect some
> devices) at freeze time...

Okay, I see that can be slow. Is there any other problem?

...and I probably would not want to suspend to my external usb2
disk... OK, that might be a small problem.
							Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: comments on irc log
       [not found]               ` <Pine.LNX.4.44L0.0503182205040.30560-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
@ 2005-03-19 10:33                 ` Pavel Machek
       [not found]                   ` <20050319103351.GM24449-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  2005-03-19 12:02                 ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 24+ messages in thread
From: Pavel Machek @ 2005-03-19 10:33 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 512 bytes --]

Hi!

> This problem would be eliminated if the boot kernel's driver was aware
> that a resume-from-disk was in progress.  I've heard that suspend2 can
> offer such a feature; can it be added to swsusp?  Using a special flags 
> value for the FREEZE would be sufficient.

A flag certainly can be added, but it seems to me the solution is to stop
the DMA in this case.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: comments on irc log
       [not found]               ` <Pine.LNX.4.44L0.0503182205040.30560-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
  2005-03-19 10:33                 ` Pavel Machek
@ 2005-03-19 12:02                 ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2005-03-19 12:02 UTC (permalink / raw)
  To: Alan Stern; +Cc: Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 1236 bytes --]

On Fri, 2005-03-18 at 22:23 -0500, Alan Stern wrote:

> But there might be a different problem.  If a USB driver is not modular,
> then what happens when the swsusp boot kernel loads the memory image?  It
> FREEZEs all the devices first, right?  So that the image kernel gets
> control back with the devices in a known good state.  If you allowed a
> device to continue doing DMA during this FREEZE (even if it's just OUT
> transfers, not writing to memory), it could end up reading arbitrary
> invalid data which would cause a device error.
> 
> This problem would be eliminated if the boot kernel's driver was aware
> that a resume-from-disk was in progress.  I've heard that suspend2 can
> offer such a feature; can it be added to swsusp?  Using a special flags 
> value for the FREEZE would be sufficient.

Oh, I was not talking about letting devices do DMA etc... in general. I
was only talking about the specific case of OHCI not suspending the bus.
All queue processing would still be stopped, and URBs refused etc.., the
only DMA happening there would be the updating of the HCCA which I think
can be managed, but then, again, this is just some optimisation that can
be done later once we have a working & stable setup.

Ben.




* Re: comments on irc log
       [not found]                   ` <20050319103351.GM24449-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-19 15:49                     ` Alan Stern
  0 siblings, 0 replies; 24+ messages in thread
From: Alan Stern @ 2005-03-19 15:49 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux-pm mailing list

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1402 bytes --]

On Sat, 19 Mar 2005, Pavel Machek wrote:

> Hi!
> 
> > This problem would be eliminated if the boot kernel's driver was aware
> > that a resume-from-disk was in progress.  I've heard that suspend2 can
> > offer such a feature; can it be added to swsusp?  Using a special flags 
> > value for the FREEZE would be sufficient.
> 
> A flag certainly can be added, but it seems to me the solution is to stop
> the DMA in this case.

On Sat, 19 Mar 2005, Benjamin Herrenschmidt wrote:

> Oh, I was not talking about letting devices do DMA etc... in general. I
> was only talking about the specific case of OHCI not suspending the bus.
> All queue processing would still be stopped, and URBs refused etc.., the
> only DMA happening there would be the updating of the HCCA which I think
> can be managed, but then, again, this is just some optimisation that can
> be done later once we have a working & stable setup.

Certainly the easiest solution for now is always to stop DMA and just live 
with the fact that for UHCI, FREEZE->ON will be a little slow.

Ultimately, if uhci-hcd can tell apart the two types of FREEZE (preparing
to create snapshot vs. preparing to restore snapshot) then it could know to
leave DMA on for the first but not the second.  I agree it's too early to 
implement this -- a minimum requirement is that CONFIG_USB_SUSPEND should 
always be true and not a configurable option.
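
To make the distinction concrete, here is a minimal user-space sketch of a
host controller driver telling the two kinds of FREEZE apart. Everything in
it is hypothetical: the flag name SKETCH_FL_RESTORE, the message layout and
the toy controller are assumptions for illustration, not existing kernel
interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names are real kernel API. */
enum sketch_event {
	SKETCH_EVENT_ON,
	SKETCH_EVENT_FREEZE,
	SKETCH_EVENT_SUSPEND,
};

/* Assumed flag: set when the FREEZE precedes loading a saved image. */
#define SKETCH_FL_RESTORE 0x1u

struct sketch_pm_message {
	enum sketch_event event;
	unsigned int flags;
};

/* Toy host controller state. */
struct sketch_hc {
	bool queues_running;	/* schedule processing enabled */
	bool dma_enabled;	/* controller may still touch memory */
};

/* Returns 0 on success, -1 for an event this sketch does not handle. */
int sketch_hc_suspend(struct sketch_hc *hc, struct sketch_pm_message msg)
{
	assert(hc != NULL);

	switch (msg.event) {
	case SKETCH_EVENT_FREEZE:
		/* Always stop queue processing so no new transfers start. */
		hc->queues_running = false;
		/*
		 * Before a snapshot, leftover DMA is tolerated in this
		 * sketch.  Before a restore, memory is about to be
		 * overwritten, so DMA has to stop.
		 */
		hc->dma_enabled = !(msg.flags & SKETCH_FL_RESTORE);
		return 0;
	case SKETCH_EVENT_SUSPEND:
		hc->queues_running = false;
		hc->dma_enabled = false;
		return 0;
	default:
		return -1;
	}
}
```

A FREEZE with flags == 0 leaves dma_enabled set; the same event with
SKETCH_FL_RESTORE clears it, which is the "leave DMA on for the first but
not the second" behaviour described above.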

Alan Stern



* Re: comments on irc log
       [not found]       ` <20050321130612.135d726e-aftB2sG12IhaqnLngUycEA@public.gmane.org>
@ 2005-03-21 20:03         ` Pavel Machek
  0 siblings, 0 replies; 24+ messages in thread
From: Pavel Machek @ 2005-03-21 20:03 UTC (permalink / raw)
  To: Jordan Crouse; +Cc: Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 1284 bytes --]

Hi!

> > > 21:52:11< pavelm> I was playing with variable scheduling ticks here,
> > > hoping to save some power. 21:52:31< pavelm> How big power savings
> > > should I expect? 21:52:48< pavelm> What cpu will benefit most?
> > > 21:53:04< pavelm> Is there easy way to measure it?
> > > 
> > > I played with that too on some PPCs and was surprised by the absence
> > > of benefit, but I might have done something wrong, I need to
> > > instrument the stuff better.
> > 
> > Difference between HZ=100 and HZ=1000 was measured to be approx. as
> > big as disk spun up vs. spun down (i.e. a watt or so) by
> > seife.... Not *that* unimportant. But it took month or so to get that
> > data, because he was basically measuring runtime from full to empty
> > battery.
> 
> Have any of these patches been published?  I've been playing with
> variable ticks too, and I would like to see what other people have been
> doing (as so far, the only thing I've managed to do is commit crimes
> against nature and the idle loop).

Try searching lkml archives for CONFIG_NO_IDLE_HZ.... There are
actually two different codebases floating around.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: comments on irc log
       [not found]   ` <20050318181317.GD18427-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  2005-03-18 23:15     ` Benjamin Herrenschmidt
@ 2005-03-21 20:06     ` Jordan Crouse
       [not found]       ` <20050321130612.135d726e-aftB2sG12IhaqnLngUycEA@public.gmane.org>
  1 sibling, 1 reply; 24+ messages in thread
From: Jordan Crouse @ 2005-03-21 20:06 UTC (permalink / raw)
  To: Linux-pm mailing list

[-- Attachment #1: Type: text/plain, Size: 1200 bytes --]

On Fri, 18 Mar 2005 19:13:17 +0100
"Pavel Machek" <pavel-+ZI9xUNit7I@public.gmane.org> wrote:

> > 21:52:11< pavelm> I was playing with variable scheduling ticks here,
> > hoping to save some power. 21:52:31< pavelm> How big power savings
> > should I expect? 21:52:48< pavelm> What cpu will benefit most?
> > 21:53:04< pavelm> Is there easy way to measure it?
> > 
> > I played with that too on some PPCs and was surprised by the absence
> > of benefit, but I might have done something wrong, I need to
> > instrument the stuff better.
> 
> Difference between HZ=100 and HZ=1000 was measured to be approx. as
> big as disk spun up vs. spun down (i.e. a watt or so) by
> seife.... Not *that* unimportant. But it took month or so to get that
> data, because he was basically measuring runtime from full to empty
> battery.

Have any of these patches been published?  I've been playing with
variable ticks too, and I would like to see what other people have been
doing (as so far, the only thing I've managed to do is commit crimes
against nature and the idle loop).

Jordan

-- 
Jordan Crouse
Senior Linux Engineer
AMD - Personal Connectivity Solutions Group
<www.amd.com/embeddedprocessors>






* Re: comments on irc log
  2005-03-18  2:32 comments on irc log Benjamin Herrenschmidt
  2005-03-18 16:56 ` Alan Stern
  2005-03-18 18:13 ` Pavel Machek
@ 2005-03-23 19:46 ` David Brownell
       [not found]   ` <200503231146.17105.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  2 siblings, 1 reply; 24+ messages in thread
From: David Brownell @ 2005-03-23 19:46 UTC (permalink / raw)
  To: linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 1143 bytes --]

On Thursday 17 March 2005 6:32 pm, Benjamin Herrenschmidt wrote:

> BTW. David, can't your clock stuff be simply represented in terms of bus & device states as
> well ? In most case, it's not PCI, so it could be defined as special bus types with states
> matching the various clock states.

It's not "my" clock stuff ... I don't design hundreds of chips!  ;)

Yes and no.  I described a canonical situation in IRC, as applied to
certain devices.  A given device state has consequences in terms of
clock usage.  Using ACPI terminology, it'd be straightforward to
support D0 (operational), D2 (suspend), and D3 (poweroff) states
for many peripherals.  And folk are using platform_bus for this,
nothing special is necessary.

Thing is, it's the system power states that are placing clock
constraints on devices.  On OMAP, going into "deep sleep" means
you've got to stop using the 48 MHz clock.  For "big sleep",
you can keep using that clock.  Most other CPUs have similar
constraints:  multiple system states, defined primarily by
clock usage.  (Discussing off-chip peripherals like LCDs and
backlights adds some orthogonal dimensions.)

- Dave


* Re: comments on irc log
       [not found]   ` <200503231146.17105.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-23 19:53     ` David Brownell
       [not found]       ` <200503231153.48230.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: David Brownell @ 2005-03-23 19:53 UTC (permalink / raw)
  To: linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 681 bytes --]

On Wednesday 23 March 2005 11:46 am, David Brownell wrote:

> Thing is, it's the system power states that are placing clock
> constraints on devices.  On OMAP, going into "deep sleep" means
> you've got to stop using the 48 MHz clock.  For "big sleep",
> you can keep using that clock.  Most other CPUs have similar
> constraints:  multiple system states, defined primarily by
> clock usage.  


So to focus on one point:  "pm_message_t" doesn't work well
at all, since it doesn't have a way to identify the target
system power state, and drivers thus have no way to see if
they should drop their requests for those clocks or whether
the hardware should keep working away.

- Dave


* Re: comments on irc log
       [not found]       ` <200503231153.48230.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-23 20:16         ` Todd Poynor
       [not found]           ` <4241CE9B.5050604-Igf4POYTYCDQT0dZR+AlfA@public.gmane.org>
  2005-03-23 21:08         ` Pavel Machek
  1 sibling, 1 reply; 24+ messages in thread
From: Todd Poynor @ 2005-03-23 20:16 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg

David Brownell wrote:

>>Thing is, it's the system power states that are placing clock
>>constraints on devices.  On OMAP, going into "deep sleep" means
>>you've got to stop using the 48 MHz clock.  For "big sleep",
>>you can keep using that clock.  Most other CPUs have similar
>>constraints:  multiple system states, defined primarily by
>>clock usage.  
> 
> 
> 
> So to focus on one point:  "pm_message_t" doesn't work well
> at all, since it doesn't have a way to identify the target
> system power state, and drivers thus have no way to see if
> they should drop their requests for those clocks or whether
> the hardware should keep working away.

If I've followed the discussion correctly, it sounds like a lot of the 
system intelligence is targeted at the bus driver level, and the current 
generic platform bus driver for embedded onchip devices will probably 
become something very tied to the particular platform.  If so, then at 
least the bus driver would need to be told of the system state, which 
can code the logic for figuring out which devices must be stopped prior 
to entering a state, and device drivers can simply follow orders to 
suspend.  But I suppose there's some cases in which a device driver may 
have options more complicated than run/suspend in the face of changes in 
clock gating, so having the info available to all drivers could be 
useful even in that situation.
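
As a rough illustration of the bus-level variant, the logic could be a
per-state table of clocks that stop, consulted before entering the state.
This is only a user-space sketch: the state names, clock bits and table
contents are made up for the example and do not describe any real platform
code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical system states. */
enum sketch_bus_state {
	SKETCH_BUS_BIG_SLEEP,
	SKETCH_BUS_DEEP_SLEEP,
	SKETCH_BUS_NR_STATES,
};

/* Hypothetical clock bits. */
#define SKETCH_CLK_48M 0x1u
#define SKETCH_CLK_32K 0x2u

/* Per-state mask of clocks that stop; contents are illustrative only. */
static const unsigned int sketch_clocks_stopped[SKETCH_BUS_NR_STATES] = {
	[SKETCH_BUS_BIG_SLEEP]  = 0,
	[SKETCH_BUS_DEEP_SLEEP] = SKETCH_CLK_48M,
};

struct sketch_bus_dev {
	const char *name;
	unsigned int clocks_needed;	/* mask of clocks the device uses */
	int suspended;
};

/*
 * Mark every device that depends on a clock the target state stops as
 * suspended.  Returns how many devices depend on a stopped clock.
 */
size_t sketch_bus_prepare(struct sketch_bus_dev *devs, size_t n,
			  enum sketch_bus_state target)
{
	unsigned int stopped;
	size_t i, count = 0;

	assert(devs != NULL && target < SKETCH_BUS_NR_STATES);

	stopped = sketch_clocks_stopped[target];
	for (i = 0; i < n; i++) {
		if (devs[i].clocks_needed & stopped) {
			devs[i].suspended = 1;
			count++;
		}
	}
	return count;
}
```

Here the drivers only follow orders; the knowledge of which state stops
which clock lives in one table per platform.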

-- 
Todd


* Re: comments on irc log
       [not found]           ` <4241CE9B.5050604-Igf4POYTYCDQT0dZR+AlfA@public.gmane.org>
@ 2005-03-23 20:46             ` David Brownell
       [not found]               ` <200503231246.05656.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: David Brownell @ 2005-03-23 20:46 UTC (permalink / raw)
  To: Todd Poynor; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 2244 bytes --]

On Wednesday 23 March 2005 12:16 pm, Todd Poynor wrote:
> David Brownell wrote:
> 
> > 
> > So to focus on one point:  "pm_message_t" doesn't work well
> > at all, since it doesn't have a way to identify the target
> > system power state, and drivers thus have no way to see if
> > they should drop their requests for those clocks or whether
> > the hardware should keep working away.
> 
> If I've followed the discussion correctly, it sounds like a lot of the 
> system intelligence is targeted at the bus driver level, and the current 
> generic platform bus driver for embedded onchip devices will probably 
> become something very tied to the particular platform. 

There's no driver for the "platform bus", just individual drivers
that hang off that data structure.   So nothing to target with
any "intelligence"; it's safe from Homeland Security.  :)


> If so, then at  
> least the bus driver would need to be told of the system state, which 
> can code the logic for figuring out which devices must be stopped prior 
> to entering a state, and device drivers can simply follow orders to 
> suspend.

I think it suffices to have the drivers know what to do:
"If going to system state X, then drop request for clock Y".

You seem to suggest something that knows which drivers exist,
and then goes to talk to them.  This isn't IMO a problem that
needs to be centrally managed, and I think it'd work better
to just let them do the right thing ... easier to make one
driver coordinate such stuff internally, than to make it
cope with various externally-induced surprises.


> But I suppose there's some cases in which a device driver may  
> have options more complicated than run/suspend in the face of changes in 
> clock gating, so having the info available to all drivers could be 
> useful even in that situation.

That too.  But, remember that in this case <asm/hardware/clock.h>
isn't structured to give clock change notifications to drivers; it's
not a cpufreq style thing, as a rule.  The model is that drivers
manage their own clocks, there's no scale() callback from any
central manager (like DPM does).  And the only question is when
they clk_unuse():  which system state is active after suspend().
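
A minimal user-space sketch of that rule ("if going to system state X, then
drop request for clock Y"), with the driver making the decision itself. The
types and helpers are hypothetical stand-ins: sketch_clk_use() and
sketch_clk_unuse() only mimic a use count and are not the real clock API,
and the two state names are borrowed from the OMAP example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical target system states, named after the OMAP example. */
enum sketch_sys_state {
	SKETCH_SYS_BIG_SLEEP,	/* 48 MHz clock may keep running */
	SKETCH_SYS_DEEP_SLEEP,	/* 48 MHz clock must be released */
};

/* Toy clock with a use count, standing in for a real clock handle. */
struct sketch_clk {
	int users;
};

static void sketch_clk_use(struct sketch_clk *clk)
{
	clk->users++;
}

static void sketch_clk_unuse(struct sketch_clk *clk)
{
	assert(clk->users > 0);
	clk->users--;
}

struct sketch_dev {
	struct sketch_clk *clk48m;	/* the clock this device needs */
	int holds_clk;			/* nonzero while a request is outstanding */
};

/* Take the clock request back if it was dropped on suspend. */
void sketch_dev_resume(struct sketch_dev *dev)
{
	assert(dev != NULL && dev->clk48m != NULL);
	if (!dev->holds_clk) {
		sketch_clk_use(dev->clk48m);
		dev->holds_clk = 1;
	}
}

/* Only deep sleep forces the driver off the clock. */
void sketch_dev_suspend(struct sketch_dev *dev, enum sketch_sys_state target)
{
	assert(dev != NULL && dev->clk48m != NULL);
	if (target == SKETCH_SYS_DEEP_SLEEP && dev->holds_clk) {
		sketch_clk_unuse(dev->clk48m);
		dev->holds_clk = 0;
	}
}
```

The point of the sketch is that suspend() needs the target system state as
an argument; without it the driver cannot choose between the two branches.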

- Dave


> 
> -- 
> Todd
> 


* Re: comments on irc log
       [not found]       ` <200503231153.48230.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  2005-03-23 20:16         ` Todd Poynor
@ 2005-03-23 21:08         ` Pavel Machek
       [not found]           ` <20050323210835.GF30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  1 sibling, 1 reply; 24+ messages in thread
From: Pavel Machek @ 2005-03-23 21:08 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 1012 bytes --]

Hi!

> > Thing is, it's the system power states that are placing clock
> > constraints on devices.  On OMAP, going into "deep sleep" means
> > you've got to stop using the 48 MHz clock.  For "big sleep",
> > you can keep using that clock.  Most other CPUs have similar
> > constraints:  multiple system states, defined primarily by
> > clock usage.  
> 
> 
> So to focus on one point:  "pm_message_t" doesn't work well
> at all, since it doesn't have a way to identify the target
> system power state, and drivers thus have no way to see if
> they should drop their requests for those clocks or whether
> the hardware should keep working away.

Well, in current model, drivers should stop all the activity they can
("deep sleep"). If you want to add support for "big sleep", then you
should probably use flags... I guess calling "big sleep" standby is
okay...

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: comments on irc log
       [not found]           ` <20050323210835.GF30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-23 21:33             ` David Brownell
       [not found]               ` <200503231333.22647.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: David Brownell @ 2005-03-23 21:33 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 2059 bytes --]

On Wednesday 23 March 2005 1:08 pm, Pavel Machek wrote:
> Hi!
> 
> > > Thing is, it's the system power states that are placing clock
> > > constraints on devices.  On OMAP, going into "deep sleep" means
> > > you've got to stop using the 48 MHz clock.  For "big sleep",
> > > you can keep using that clock.  Most other CPUs have similar
> > > constraints:  multiple system states, defined primarily by
> > > clock usage.  
> > 
> > 
> > So to focus on one point:  "pm_message_t" doesn't work well
> > at all, since it doesn't have a way to identify the target
> > system power state, and drivers thus have no way to see if
> > they should drop their requests for those clocks or whether
> > the hardware should keep working away.
> 
> Well, in current model,

That is, after pm_message_t change.  That represents a loss of
functionality.  Previously drivers received a target system
sleep state, and could make such deductions easily:  anything
like a PCI D3cold ("4") means maximal power off, anything like
a PCI D3hot ("3") is less aggressive, and so on.  (Not that all
drivers behaved right, or that the different incarnations of
the "pm core" code used "3" vs "4" sanely, etc.)

So maybe one question for tomorrow should be how we'll restore
that temporarily-lost functionality.


> drivers should stop all the activity they can 
> ("deep sleep"). If you want to add support for "big sleep", then you
> should probably use flags... I guess calling "big sleep" standby is
> okay...

That's the loss of functionality.  Previously drivers didn't need
to "stop all the activity they can" (PCI D3cold = 4), they also
had options that didn't assume swsusp poweroff (PCI D3hot = 3).

I don't know what you mean by "flags".  Mapping "big sleep" to a
"standby" might make sense, specifically for that one architecture,
but that doesn't seem like it'd address the  general issue.  What
if there are more than two such non-"disk" system states that need
support, for example?  Or about system states that relate to more
factors than just the CPU/SOC states?

- Dave


* Re: comments on irc log
       [not found]               ` <200503231333.22647.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-23 21:53                 ` Pavel Machek
       [not found]                   ` <20050323215330.GJ30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Pavel Machek @ 2005-03-23 21:53 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg

[-- Attachment #1: Type: text/plain, Size: 15590 bytes --]

Hi!

> > > > Thing is, it's the system power states that are placing clock
> > > > constraints on devices.  On OMAP, going into "deep sleep" means
> > > > you've got to stop using the 48 MHz clock.  For "big sleep",
> > > > you can keep using that clock.  Most other CPUs have similar
> > > > constraints:  multiple system states, defined primarily by
> > > > clock usage.  
> > > 
> > > 
> > > So to focus on one point:  "pm_message_t" doesn't work well
> > > at all, since it doesn't have a way to identify the target
> > > system power state, and drivers thus have no way to see if
> > > they should drop their requests for those clocks or whether
> > > the hardware should keep working away.
> > 
> > Well, in current model,
> 
> That is, after pm_message_t change.  That represents a loss of
> functionality.  Previously drivers received a target system
> sleep state, and could make such deductions easily:  anything
> like a PCI D3cold ("4") means maximal power off, anything like
> a PCI D3hot ("3") is less aggressive, and so on.  (Not that all
> drivers behaved right, or that the different incarnations of
> the "pm core" code used "3" vs "4" sanely, etc.)
> 
> So maybe one question for tomorrow should be how we'll restore
> that temporarily-lost functionality.

pm_message_t change is not done yet. It is going to be a structure, and
it will have a second field, "flags", that will address your concerns.
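
Roughly, and only as a user-space sketch inferred from how the type is used
in the patch: the field and constant names (event, flags, PM_EVENT_*,
PFL_RUNTIME) are taken from it, but the numeric values, the flag bit and
the two sketch_* helpers are assumptions made for illustration.

```c
#include <assert.h>

/*
 * Event codes: names come from the patch, numeric values are assumed.
 * ON has to be 0 because the patch tests "!power_state.event" for "on".
 */
#define PM_EVENT_ON      0
#define PM_EVENT_FREEZE  1
#define PM_EVENT_SUSPEND 2

/* Assumed flag bit; the patch only mentions PFL_RUNTIME in a comment. */
#define PFL_RUNTIME 0x1u

typedef struct pm_message {
	int event;		/* what is happening */
	unsigned int flags;	/* extra detail about why */
} pm_message_t;

/* Toy device power levels, standing in for PCI_D0 and PCI_D3hot. */
enum sketch_power { SKETCH_D0 = 0, SKETCH_D3HOT = 3 };

/*
 * Same shape as the pci_choose_state() change in the patch, but returns
 * -1 for an unknown event instead of calling BUG().
 */
int sketch_choose_state(pm_message_t state)
{
	assert(state.event >= 0);	/* event codes are non-negative here */

	switch (state.event) {
	case PM_EVENT_ON:
		return SKETCH_D0;
	case PM_EVENT_FREEZE:
	case PM_EVENT_SUSPEND:
		return SKETCH_D3HOT;
	default:
		return -1;
	}
}

/* Drivers that care about the reason can look at the flags. */
int sketch_is_runtime(pm_message_t state)
{
	return (state.flags & PFL_RUNTIME) != 0;
}
```

A driver that does not care about the extra detail keeps switching on
.event alone; one that does can test .flags without a new event code.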

Here's a patch doing that... It may even make it into 2.6.12.

								Pavel

--- clean/drivers/base/power/resume.c	2004-12-25 13:34:59.000000000 +0100
+++ linux/drivers/base/power/resume.c	2005-03-22 12:20:53.000000000 +0100
@@ -41,7 +41,7 @@
 		list_add_tail(entry, &dpm_active);
 
 		up(&dpm_list_sem);
-		if (!dev->power.prev_state)
+		if (!dev->power.prev_state.event)
 			resume_device(dev);
 		down(&dpm_list_sem);
 		put_device(dev);
--- clean/drivers/base/power/runtime.c	2005-01-12 11:07:39.000000000 +0100
+++ linux/drivers/base/power/runtime.c	2005-03-22 12:20:53.000000000 +0100
@@ -13,10 +13,10 @@
 static void runtime_resume(struct device * dev)
 {
 	dev_dbg(dev, "resuming\n");
-	if (!dev->power.power_state)
+	if (!dev->power.power_state.event)
 		return;
 	if (!resume_device(dev))
-		dev->power.power_state = 0;
+		dev->power.power_state = PMSG_ON;
 }
 
 
@@ -49,10 +49,10 @@
 	int error = 0;
 
 	down(&dpm_sem);
-	if (dev->power.power_state == state)
+	if (dev->power.power_state.event == state.event)
 		goto Done;
 
-	if (dev->power.power_state)
+	if (dev->power.power_state.event)
 		runtime_resume(dev);
 
 	if (!(error = suspend_device(dev, state)))
--- clean/drivers/base/power/shutdown.c	2004-08-15 19:14:55.000000000 +0200
+++ linux/drivers/base/power/shutdown.c	2005-03-22 12:20:53.000000000 +0100
@@ -29,7 +29,8 @@
 			dev->driver->shutdown(dev);
 		return 0;
 	}
-	return dpm_runtime_suspend(dev, dev->detach_state);
+	/* FIXME */
+	return dpm_runtime_suspend(dev, PMSG_FREEZE);
 }
 
 
--- clean/drivers/base/power/suspend.c	2005-01-12 11:07:39.000000000 +0100
+++ linux/drivers/base/power/suspend.c	2005-03-22 12:20:53.000000000 +0100
@@ -43,7 +43,7 @@
 
 	dev->power.prev_state = dev->power.power_state;
 
-	if (dev->bus && dev->bus->suspend && !dev->power.power_state)
+	if (dev->bus && dev->bus->suspend && (!dev->power.power_state.event))
 		error = dev->bus->suspend(dev, state);
 
 	return error;
--- clean/drivers/base/power/sysfs.c	2004-08-15 19:14:55.000000000 +0200
+++ linux/drivers/base/power/sysfs.c	2005-03-22 12:20:53.000000000 +0100
@@ -26,19 +26,20 @@
 
 static ssize_t state_show(struct device * dev, char * buf)
 {
-	return sprintf(buf, "%u\n", dev->power.power_state);
+	return sprintf(buf, "%u\n", dev->power.power_state.event);
 }
 
 static ssize_t state_store(struct device * dev, const char * buf, size_t n)
 {
-	u32 state;
+	pm_message_t state;
 	char * rest;
 	int error = 0;
 
-	state = simple_strtoul(buf, &rest, 10);
+	state.event = simple_strtoul(buf, &rest, 10);
+//	state.flags = PFL_RUNTIME;
 	if (*rest)
 		return -EINVAL;
-	if (state)
+	if (state.event)
 		error = dpm_runtime_suspend(dev, state);
 	else
 		dpm_runtime_resume(dev);
--- clean/drivers/ide/ide.c	2005-03-19 00:31:23.000000000 +0100
+++ linux/drivers/ide/ide.c	2005-03-22 12:20:53.000000000 +0100
@@ -1390,7 +1390,7 @@
 	rq.special = &args;
 	rq.pm = &rqpm;
 	rqpm.pm_step = ide_pm_state_start_suspend;
-	rqpm.pm_state = state;
+	rqpm.pm_state = state.event;
 
 	return ide_do_drive_cmd(drive, &rq, ide_wait);
 }
@@ -1409,7 +1409,7 @@
 	rq.special = &args;
 	rq.pm = &rqpm;
 	rqpm.pm_step = ide_pm_state_start_resume;
-	rqpm.pm_state = 0;
+	rqpm.pm_state = PM_EVENT_ON;
 
 	return ide_do_drive_cmd(drive, &rq, ide_head_wait);
 }
--- clean/drivers/pci/pci.c	2005-03-19 00:31:43.000000000 +0100
+++ linux/drivers/pci/pci.c	2005-03-22 12:20:53.000000000 +0100
@@ -312,22 +312,24 @@
 /**
  * pci_choose_state - Choose the power state of a PCI device
  * @dev: PCI device to be suspended
- * @state: target sleep state for the whole system
+ * @state: target sleep state for the whole system. This is the value
+ *	that is passed to suspend() function.
  *
  * Returns PCI power state suitable for given device and given system
  * message.
  */
 
-pci_power_t pci_choose_state(struct pci_dev *dev, u32 state)
+pci_power_t pci_choose_state(struct pci_dev *dev, pm_message_t state)
 {
-	if (!pci_find_capability(dev, PCI_CAP_ID_PM))
+	switch (state.event) {
+	case PM_EVENT_ON:
 		return PCI_D0;
-
-	switch (state) {
-	case 0:	return PCI_D0;
-	case 2: return PCI_D2;
-	case 3: return PCI_D3hot;
-	default: BUG();
+	case PM_EVENT_FREEZE:
+	case PM_EVENT_SUSPEND:
+		return PCI_D3hot;
+	default: 
+		printk("They asked me for state %d\n", state.event);
+		BUG();
 	}
 	return PCI_D0;
 }
--- clean/drivers/usb/core/hcd-pci.c	2005-03-19 00:31:51.000000000 +0100
+++ linux/drivers/usb/core/hcd-pci.c	2005-03-22 12:20:53.000000000 +0100
@@ -68,7 +68,7 @@
 	if (pci_enable_device (dev) < 0)
 		return -ENODEV;
 	dev->current_state = 0;
-	dev->dev.power.power_state = 0;
+	dev->dev.power.power_state.event = 0;
 	
         if (!dev->irq) {
         	dev_err (&dev->dev,
@@ -291,9 +294,6 @@
 		break;
 	}
 
-	/* update power_state **ONLY** to make sysfs happier */
-	if (retval == 0)
-		dev->dev.power.power_state = state;
 	return retval;
 }
 EXPORT_SYMBOL (usb_hcd_pci_suspend);
--- clean/drivers/usb/core/hub.c	2005-03-19 00:31:51.000000000 +0100
+++ linux/drivers/usb/core/hub.c	2005-03-22 12:20:53.000000000 +0100
@@ -1557,7 +1557,7 @@
 			struct usb_driver	*driver;
 
 			intf = udev->actconfig->interface[i];
-			if (state <= intf->dev.power.power_state)
+			if (state.event <= intf->dev.power.power_state.event)
 				continue;
 			if (!intf->dev.driver)
 				continue;
@@ -1565,11 +1565,11 @@
 
 			if (driver->suspend) {
 				status = driver->suspend(intf, state);
-				if (intf->dev.power.power_state != state
+				if (intf->dev.power.power_state.event != state.event
 						|| status)
 					dev_err(&intf->dev,
 						"suspend %d fail, code %d\n",
-						state, status);
+						state.event, status);
 			}
 
 			/* only drivers with suspend() can ever resume();
@@ -1582,7 +1582,7 @@
 			 * since we know every driver's probe/disconnect works
 			 * even for drivers that can't suspend.
 			 */
-			if (!driver->suspend || state > PM_SUSPEND_MEM) {
+			if (!driver->suspend || state.event > PM_EVENT_FREEZE) {
 #if 1
 				dev_warn(&intf->dev, "resume is unsafe!\n");
 #else
@@ -1603,7 +1603,7 @@
 	 * policies (when HNP doesn't apply) once we have mechanisms to
 	 * turn power back on!  (Likely not before 2.7...)
 	 */
-	if (state > PM_SUSPEND_MEM) {
+	if (state.event > PM_EVENT_FREEZE) {
 		dev_warn(&udev->dev, "no poweroff yet, suspending instead\n");
 	}
 
@@ -1718,7 +1718,7 @@
 			struct usb_driver	*driver;
 
 			intf = udev->actconfig->interface[i];
-			if (intf->dev.power.power_state == PM_SUSPEND_ON)
+			if (intf->dev.power.power_state.event == PM_EVENT_ON)
 				continue;
 			if (!intf->dev.driver) {
 				/* FIXME maybe force to alt 0 */
@@ -1732,11 +1732,11 @@
 
 			/* can we do better than just logging errors? */
 			status = driver->resume(intf);
-			if (intf->dev.power.power_state != PM_SUSPEND_ON
+			if (intf->dev.power.power_state.event != PM_EVENT_ON
 					|| status)
 				dev_dbg(&intf->dev,
 					"resume fail, state %d code %d\n",
-					intf->dev.power.power_state, status);
+					intf->dev.power.power_state.event, status);
 		}
 		status = 0;
 
@@ -1917,7 +1917,7 @@
 	unsigned		port1;
 	int			status;
 
-	if (intf->dev.power.power_state == PM_SUSPEND_ON)
+	if (intf->dev.power.power_state.event == PM_EVENT_ON)
 		return 0;
 
 	for (port1 = 1; port1 <= hdev->maxchild; port1++) {
--- clean/drivers/usb/core/usb.c	2005-03-19 00:31:51.000000000 +0100
+++ linux/drivers/usb/core/usb.c	2005-03-22 12:20:53.000000000 +0100
@@ -1367,7 +1367,7 @@
 	driver = to_usb_driver(dev->driver);
 
 	/* there's only one USB suspend state */
-	if (intf->dev.power.power_state)
+	if (intf->dev.power.power_state.event)
 		return 0;
 
 	if (driver->suspend)
--- clean/drivers/usb/host/ehci-dbg.c	2005-01-12 11:07:40.000000000 +0100
+++ linux/drivers/usb/host/ehci-dbg.c	2005-03-22 12:20:53.000000000 +0100
@@ -641,7 +641,7 @@
 
 	spin_lock_irqsave (&ehci->lock, flags);
 
-	if (bus->controller->power.power_state) {
+	if (bus->controller->power.power_state.event) {
 		size = scnprintf (next, size,
 			"bus %s, device %s (driver " DRIVER_VERSION ")\n"
 			"SUSPENDED (no register access)\n",
--- clean/drivers/usb/host/ohci-dbg.c	2005-03-19 00:31:53.000000000 +0100
+++ linux/drivers/usb/host/ohci-dbg.c	2005-03-22 12:20:53.000000000 +0100
@@ -625,7 +625,7 @@
 		hcd->self.controller->bus_id,
 		hcd_name);
 
-	if (bus->controller->power.power_state) {
+	if (bus->controller->power.power_state.event) {
 		size -= scnprintf (next, size,
 			"SUSPENDED (no register access)\n");
 		goto done;
--- clean/drivers/usb/host/sl811-hcd.c	2005-03-19 00:31:53.000000000 +0100
+++ linux/drivers/usb/host/sl811-hcd.c	2005-03-22 12:20:53.000000000 +0100
@@ -1781,9 +1781,9 @@
 	if (phase != SUSPEND_POWER_DOWN)
 		return retval;
 
-	if (state <= PM_SUSPEND_MEM)
+	if (state.event == PM_EVENT_FREEZE)
 		retval = sl811h_hub_suspend(hcd);
-	else
+	else if (state.event == PM_EVENT_SUSPEND)
 		port_power(sl811, 0);
 	if (retval == 0)
 		dev->power.power_state = state;
@@ -1802,14 +1802,14 @@
 	/* with no "check to see if VBUS is still powered" board hook,
 	 * let's assume it'd only be powered to enable remote wakeup.
 	 */
-	if (dev->power.power_state > PM_SUSPEND_MEM
+	if (dev->power.power_state.event == PM_EVENT_SUSPEND
 			|| !hcd->can_wakeup) {
 		sl811->port1 = 0;
 		port_power(sl811, 1);
 		return 0;
 	}
 
-	dev->power.power_state = PM_SUSPEND_ON;
+	dev->power.power_state = PMSG_ON;
 	return sl811h_hub_resume(hcd);
 }
 
--- clean/drivers/video/aty/atyfb_base.c	2005-03-19 00:31:59.000000000 +0100
+++ linux/drivers/video/aty/atyfb_base.c	2005-03-22 12:20:53.000000000 +0100
@@ -2070,12 +2070,12 @@
 	struct fb_info *info = pci_get_drvdata(pdev);
 	struct atyfb_par *par = (struct atyfb_par *) info->par;
 
-	if (pdev->dev.power.power_state == 0)
+	if (pdev->dev.power.power_state.event == PM_EVENT_ON)
 		return 0;
 
 	acquire_console_sem();
 
-	if (pdev->dev.power.power_state == 2)
+	if (pdev->dev.power.power_state.event == 2)
 		aty_power_mgmt(0, par);
 	par->asleep = 0;
 
--- clean/drivers/video/aty/radeon_pm.c	2005-03-19 00:31:59.000000000 +0100
+++ linux/drivers/video/aty/radeon_pm.c	2005-03-22 12:20:53.000000000 +0100
@@ -2519,33 +2519,26 @@
 }
 
 
-static/*extern*/ int susdisking = 0;
-
-int radeonfb_pci_suspend(struct pci_dev *pdev, u32 state)
+int radeonfb_pci_suspend(struct pci_dev *pdev, pm_message_t state)
 {
         struct fb_info *info = pci_get_drvdata(pdev);
         struct radeonfb_info *rinfo = info->par;
 	u8 agp;
 	int i;
 
-	if (state == pdev->dev.power.power_state)
+	if (state.event == pdev->dev.power.power_state.event)
 		return 0;
 
 	printk(KERN_DEBUG "radeonfb (%s): suspending to state: %d...\n",
-	       pci_name(pdev), state);
+	       pci_name(pdev), state.event);
 
 	/* For suspend-to-disk, we cheat here. We don't suspend anything and
 	 * let fbcon continue drawing until we are all set. That shouldn't
 	 * really cause any problem at this point, provided that the wakeup
 	 * code knows that any state in memory may not match the HW
 	 */
-	if (state != PM_SUSPEND_MEM)
-		goto done;
-	if (susdisking) {
-		printk("radeonfb (%s): suspending to disk but state = %d\n",
-		       pci_name(pdev), state);
+	if (state.event == PM_EVENT_FREEZE)
 		goto done;
-	}
 
 	acquire_console_sem();
 
@@ -2637,7 +2630,7 @@
         struct radeonfb_info *rinfo = info->par;
 	int rc = 0;
 
-	if (pdev->dev.power.power_state == 0)
+	if (pdev->dev.power.power_state.event == PM_EVENT_ON)
 		return 0;
 
 	if (rinfo->no_schedule) {
@@ -2647,7 +2640,7 @@
 		acquire_console_sem();
 
 	printk(KERN_DEBUG "radeonfb (%s): resuming from state: %d...\n",
-	       pci_name(pdev), pdev->dev.power.power_state);
+	       pci_name(pdev), pdev->dev.power.power_state.event);
 
 
 	if (pci_enable_device(pdev)) {
@@ -2658,7 +2651,7 @@
 	}
 	pci_set_master(pdev);
 
-	if (pdev->dev.power.power_state == PM_SUSPEND_MEM) {
+	if (pdev->dev.power.power_state.event == PM_EVENT_SUSPEND) {
 		/* Wakeup chip. Check from config space if we were powered off
 		 * (todo: additionally, check CLK_PIN_CNTL too)
 		 */
--- clean/drivers/video/i810/i810_main.c	2005-03-19 00:32:00.000000000 +0100
+++ linux/drivers/video/i810/i810_main.c	2005-03-22 12:20:53.000000000 +0100
@@ -1492,18 +1492,18 @@
 /***********************************************************************
  *                         Power Management                            *
  ***********************************************************************/
-static int i810fb_suspend(struct pci_dev *dev, u32 state)
+static int i810fb_suspend(struct pci_dev *dev, pm_message_t state)
 {
 	struct fb_info *info = pci_get_drvdata(dev);
 	struct i810fb_par *par = (struct i810fb_par *) info->par;
 	int blank = 0, prev_state = par->cur_state;
 
-	if (state == prev_state)
+	if (state.event == prev_state)
 		return 0;
 
-	par->cur_state = state;
+	par->cur_state = state.event;
 
-	switch (state) {
+	switch (state.event) {
 	case 1:
 		blank = VESA_VSYNC_SUSPEND;
 		break;
@@ -1524,7 +1524,7 @@
 		pci_disable_device(dev);
 	}
 	pci_save_state(dev);
-	pci_set_power_state(dev, state);
+	pci_set_power_state(dev, pci_choose_state(dev, state));
 
 	return 0;
 }
--- clean/include/linux/pm.h	2005-03-19 00:32:25.000000000 +0100
+++ linux/include/linux/pm.h	2005-03-22 12:25:54.000000000 +0100
@@ -185,7 +185,10 @@
 
 struct device;
 
-typedef u32 __bitwise pm_message_t;
+typedef struct pm_message {
+	int event;
+	int flags;
+} pm_message_t;
 
 /*
  * There are 4 important states driver can be in:
@@ -205,9 +208,15 @@
  * or something similar soon.
  */
 
-#define PMSG_FREEZE	((__force pm_message_t) 3)
-#define PMSG_SUSPEND	((__force pm_message_t) 3)
-#define PMSG_ON		((__force pm_message_t) 0)
+#define PM_EVENT_ON 0
+#define PM_EVENT_FREEZE 1
+#define PM_EVENT_SUSPEND 2
+
+#define PFL_RUNTIME 1
+
+#define PMSG_FREEZE	({struct pm_message m; m.event = PM_EVENT_FREEZE; m.flags = 0; m; })
+#define PMSG_SUSPEND	({struct pm_message m; m.event = PM_EVENT_SUSPEND; m.flags = 0; m; })
+#define PMSG_ON		({struct pm_message m; m.event = PM_EVENT_ON; m.flags = 0; m; })
 
 struct dev_pm_info {
 	pm_message_t		power_state;

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: comments on irc log
       [not found]               ` <200503231246.05656.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
@ 2005-03-24  1:57                 ` Todd Poynor
  0 siblings, 0 replies; 24+ messages in thread
From: Todd Poynor @ 2005-03-24  1:57 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg

David Brownell wrote:
> On Wednesday 23 March 2005 12:16 pm, Todd Poynor wrote:

>>If so, then at least the bus driver would need to be told of the
>>system state, which can code the logic for figuring out which devices
>>must be stopped prior to entering a state, and device drivers can
>>simply follow orders to suspend.
> 
> 
> I think it suffices to have the drivers know what to do:
> "If going to system state X, then drop request for clock Y".
> 
> You seem to suggest something that knows which drivers exist,
> and then goes to talk to them.  This isn't IMO a problem that
> needs to be centrally managed, and I think it'd work better
> to just let them do the right thing ... easier to make one
> driver coordinate such stuff internally, than to make it
> cope with various externally-induced surprises.

It was my attempt to take some of the other comments I've read about
moving smarts into the PCI and other buses and apply them to the
platform bus; it looks like that's not what's in the works.  It would
indeed presuppose board-specific code that knows about its on-chip and
on-board drivers (for example, PowerPC 4xx at least used to centrally
manage various aspects of its on-chip devices).

So the platform bus is not intended to encapsulate the platform logic
for how device clocking interacts with system power states; that logic
goes into the drivers, which should be fine.  Depending on the eventual
implementation, some common drivers may need board-specific logic to
handle these interactions, but if so, that can be dealt with when it
comes up.
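
To make the driver-local approach concrete, here is a minimal sketch of
the rule David describes ("if going to system state X, then drop request
for clock Y").  Everything named here (enum sys_state, struct demo_dev,
demo_suspend, demo_resume) is hypothetical and only stands in for
whatever the eventual interface provides; it is plain C, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical system states, for illustration only. */
enum sys_state { SYS_RUN, SYS_STANDBY, SYS_SLEEP };

/* Hypothetical per-device bookkeeping kept by the driver itself. */
struct demo_dev {
	bool wants_clk;		/* true while the driver requests its clock */
};

/*
 * Driver-local policy: this device only gives up its clock request when
 * the system is heading for SYS_SLEEP.  No central code needs to know
 * that the device exists.
 */
static int demo_suspend(struct demo_dev *d, enum sys_state target)
{
	if (target == SYS_SLEEP)
		d->wants_clk = false;
	return 0;
}

/* On resume the driver simply asks for its clock again. */
static int demo_resume(struct demo_dev *d)
{
	d->wants_clk = true;
	return 0;
}
```

The point of the sketch is only that the decision lives next to the
device state it affects; a board-specific variant would change the
condition inside demo_suspend() rather than add an external manager.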

-- 
Todd


* Re: comments on irc log
       [not found]                   ` <20050323215330.GJ30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
@ 2005-03-24 18:40                     ` Patrick Mochel
  0 siblings, 0 replies; 24+ messages in thread
From: Patrick Mochel @ 2005-03-24 18:40 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-pm-qjLDD68F18O7TbgM5vRIOg




On Wed, 23 Mar 2005, Pavel Machek wrote:

A couple of questions:



> --- clean/drivers/base/power/shutdown.c	2004-08-15 19:14:55.000000000 +0200
> +++ linux/drivers/base/power/shutdown.c	2005-03-22 12:20:53.000000000 +0100
> @@ -29,7 +29,8 @@
>  			dev->driver->shutdown(dev);
>  		return 0;
>  	}
> -	return dpm_runtime_suspend(dev, dev->detach_state);
> +	/* FIXME */
> +	return dpm_runtime_suspend(dev, PMSG_FREEZE);

Why is this a FIXME? Mind adding a more descriptive comment there?

> --- clean/drivers/base/power/sysfs.c	2004-08-15 19:14:55.000000000 +0200
> +++ linux/drivers/base/power/sysfs.c	2005-03-22 12:20:53.000000000 +0100
> @@ -26,19 +26,20 @@
>
>  static ssize_t state_show(struct device * dev, char * buf)
>  {
> -	return sprintf(buf, "%u\n", dev->power.power_state);
> +	return sprintf(buf, "%u\n", dev->power.power_state.event);
>  }
>
>  static ssize_t state_store(struct device * dev, const char * buf, size_t n)
>  {
> -	u32 state;
> +	pm_message_t state;
>  	char * rest;
>  	int error = 0;
>
> -	state = simple_strtoul(buf, &rest, 10);
> +	state.event = simple_strtoul(buf, &rest, 10);
> +//	state.flags = PFL_RUNTIME;

What is PFL_RUNTIME?

This appears to be the only usage of the .flags field. Perhaps we should
omit that field (and this commented out line) for now?
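
As a rough illustration of what that would leave, here is a userspace
sketch of an event-only pm_message_t and a pci_choose_state()-style
mapping.  The PM_EVENT_* values are taken from the patch; the DEMO_*
enum and demo_choose_state() are made-up stand-ins, and the PMSG_*
macros use C99 compound literals rather than the statement expressions
in the patch, which is a substitution on my part.

```c
/* Event-only message: the .flags field is assumed to be dropped. */
typedef struct pm_message {
	int event;
} pm_message_t;

/* Event values as defined in the patch. */
#define PM_EVENT_ON		0
#define PM_EVENT_FREEZE		1
#define PM_EVENT_SUSPEND	2

/* C99 compound literals (assumption: not what the patch itself uses). */
#define PMSG_ON		((pm_message_t){ .event = PM_EVENT_ON })
#define PMSG_FREEZE	((pm_message_t){ .event = PM_EVENT_FREEZE })
#define PMSG_SUSPEND	((pm_message_t){ .event = PM_EVENT_SUSPEND })

/* Hypothetical stand-in for pci_power_t. */
enum demo_pci_state { DEMO_ERR = -1, DEMO_D0 = 0, DEMO_D3HOT = 3 };

/*
 * Same shape as the new pci_choose_state(): switch on the event.
 * The patch calls BUG() for unknown events; this sketch returns an
 * error value instead so it can run in userspace.
 */
static enum demo_pci_state demo_choose_state(pm_message_t state)
{
	switch (state.event) {
	case PM_EVENT_ON:
		return DEMO_D0;
	case PM_EVENT_FREEZE:
	case PM_EVENT_SUSPEND:
		return DEMO_D3HOT;
	default:
		return DEMO_ERR;
	}
}
```

With only .event present, every comparison in the converted drivers
(state.event == PM_EVENT_FREEZE and so on) stays exactly as in the
patch.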

Thanks,


	Pat



Thread overview: 24+ messages
2005-03-18  2:32 comments on irc log Benjamin Herrenschmidt
2005-03-18 16:56 ` Alan Stern
     [not found]   ` <Pine.LNX.4.44L0.0503181147110.1099-100000-3WpdWqXrU/qjv4eRiOYp3g@public.gmane.org>
2005-03-18 18:14     ` Pavel Machek
2005-03-18 23:07     ` Benjamin Herrenschmidt
2005-03-18 23:18       ` Pavel Machek
     [not found]         ` <20050318231801.GE24449-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-19  1:21           ` Benjamin Herrenschmidt
2005-03-19  3:23             ` Alan Stern
     [not found]               ` <Pine.LNX.4.44L0.0503182205040.30560-100000-pYrvlCTfrz9XsRXLowluHWD2FQJk+8+b@public.gmane.org>
2005-03-19 10:33                 ` Pavel Machek
     [not found]                   ` <20050319103351.GM24449-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-19 15:49                     ` Alan Stern
2005-03-19 12:02                 ` Benjamin Herrenschmidt
2005-03-19 10:32             ` Pavel Machek
2005-03-18 18:13 ` Pavel Machek
     [not found]   ` <20050318181317.GD18427-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-18 23:15     ` Benjamin Herrenschmidt
2005-03-21 20:06     ` Jordan Crouse
     [not found]       ` <20050321130612.135d726e-aftB2sG12IhaqnLngUycEA@public.gmane.org>
2005-03-21 20:03         ` Pavel Machek
2005-03-23 19:46 ` David Brownell
     [not found]   ` <200503231146.17105.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 19:53     ` David Brownell
     [not found]       ` <200503231153.48230.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 20:16         ` Todd Poynor
     [not found]           ` <4241CE9B.5050604-Igf4POYTYCDQT0dZR+AlfA@public.gmane.org>
2005-03-23 20:46             ` David Brownell
     [not found]               ` <200503231246.05656.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-24  1:57                 ` Todd Poynor
2005-03-23 21:08         ` Pavel Machek
     [not found]           ` <20050323210835.GF30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-23 21:33             ` David Brownell
     [not found]               ` <200503231333.22647.david-b-yBeKhBN/0LDR7s880joybQ@public.gmane.org>
2005-03-23 21:53                 ` Pavel Machek
     [not found]                   ` <20050323215330.GJ30704-I/5MKhXcvmPrBKCeMvbIDA@public.gmane.org>
2005-03-24 18:40                     ` Patrick Mochel
