* Re: [linux-pm] OpPoint summary
@ 2006-09-18 13:36 Scott E. Preece
2006-09-18 13:46 ` Pavel Machek
0 siblings, 1 reply; 5+ messages in thread
From: Scott E. Preece @ 2006-09-18 13:36 UTC (permalink / raw)
To: pavel; +Cc: daviado, linux-pm, linux-kernel
| From: Pavel Machek<pavel@ucw.cz>
|
| Hi!
|
| > >Care to resend your patches in the proper format, through email so that
| > >we can see them, and possibly get some testing in -mm if they look sane?
| >
| > Greg,
| > here's the patch that implements operating points for different
| > frequencies
| > for the speedstep-centrino line of processors. Operating points are created
| > in much the same manner that cpufreq tables are. This works for both
| > simple implementations like the centrino and more complex SoC systems
| > like the arm-pxa72x which has several clocks to control, and different clock
| > divisors and multipliers.
|
| > +static struct oppoint lowest = {
| > + .name = "lowest",
| > + .type = PM_FREQ_CHANGE,
| > + .frequency = 0,
| > + .voltage = 0,
| > + .latency = 15,
| > + .prepare_transition = cpufreq_prepare_transition,
| > + .transition = centrino_transition,
| > + .finish_transition = cpufreq_finish_transition,
| > +};
|
| We had nice, descriptive interface... with numbers. Now you want to
| introduce english state names... looks like a step back to me.
---
Well, a single number is fine if you're describing a scalar abstraction,
but an operating point is a vector. You can't assume that "399" is three
times "133" in performance or energy cost, so its "numberness" is simply
misleading.
scott
--
scott preece
motorola mobile devices, il67, 1800 s. oak st., champaign, il 61820
e-mail: preece@motorola.com fax: +1-217-384-8550
phone: +1-217-384-8589 cell: +1-217-433-6114 pager: 2174336114@vtext.com
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [linux-pm] OpPoint summary 2006-09-18 13:36 [linux-pm] OpPoint summary Scott E. Preece @ 2006-09-18 13:46 ` Pavel Machek 0 siblings, 0 replies; 5+ messages in thread From: Pavel Machek @ 2006-09-18 13:46 UTC (permalink / raw) To: Scott E. Preece; +Cc: daviado, linux-pm, linux-kernel H! > | > +static struct oppoint lowest = { > | > + .name = "lowest", > | > + .type = PM_FREQ_CHANGE, > | > + .frequency = 0, > | > + .voltage = 0, > | > + .latency = 15, > | > + .prepare_transition = cpufreq_prepare_transition, > | > + .transition = centrino_transition, > | > + .finish_transition = cpufreq_finish_transition, > | > +}; > | > | We had nice, descriptive interface... with numbers. Now you want to > | introduce english state names... looks like a step back to me. > --- > > Well, a single number is fine if you're describing a scalar abstraction, > but an operating point is a vector. You can't assume that "399" is three > times "133" in performance or energy cost, so its "numberness" is simply > misleading. "lowest" can simply be mapped to "0", with "low" mapped to "1", etc. I believe, using english names is wrong in this case. If you want to provide vectors... well provide the vectors. Is "medium" operating point 1GHz on cpu 0 and 2GHz on cpu 1, or is it 1.5 ghz on cpu 0 and 1.5 ghz on cpu 1? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: community PM requirements/issues and PowerOP [Was: Re: So, what's the status on the recent patches here?] @ 2006-09-11 8:20 Pavel Machek 2006-09-11 19:55 ` Pavel Machek 2006-09-11 21:00 ` Pavel Machek 0 siblings, 2 replies; 5+ messages in thread From: Pavel Machek @ 2006-09-11 8:20 UTC (permalink / raw) To: Eugeny S. Mints; +Cc: pm list, scott.preece Hi! On Mon 2006-09-11 11:57:28, Eugeny S. Mints wrote: > [snip] > >> Are you arguing that the cpufreq interface be morphed to support power > >> op applications? > > > > No. I'm arguing that > > > > * cpufreq interface should be used for changing cpu frequency > the patch set i sent out has cpufreq used for changing cpu frequency, > hasn't it? I was talking about kernel<->user interface. You did echo low > something to change CPU frequency, IIRC. > can we eventually start talking more close to the code rather than > speculating without it? Lets get kernel<->user interface right, first. You'll need to create Documentation/ entries for your interfaces, eventually, so lets do that, first, and then talk about code. Oh and it would be nice to cc lkml on that document, too. New kernel<->user interface is not decision taken lightly. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: community PM requirements/issues and PowerOP [Was: Re: So, what's the status on the recent patches here?] @ 2006-09-11 19:55 ` Pavel Machek 2006-09-11 20:53 ` Eugeny S. Mints 0 siblings, 1 reply; 5+ messages in thread From: Pavel Machek @ 2006-09-11 19:55 UTC (permalink / raw) To: Matthew Locke; +Cc: pm list, Preece Scott-PREECE Hi! > I know its confusing having oppoint (from Dave Singleton) and powerop > being discussed at the same time. However, I believe we (PowerOP) > have Yes, it is. > - PowerOP is only one layer (towards the bottom) in a power management > solution. > - PowerOP does *not* replace cpufreq PowerOP provides userland interface for changing processor frequency. That's bad -- duplicate interface. > - The PowerOP interface was discussed in detail on this list and we > haven't heard any negative comments. Eh? Was I on different list? > - We are not advocating the integration with sleep states. We want to > get the PowerOP interface accepted and then we can build on it. Good. > We have a few more comments from Greg to take care of and we can add a > Documentation/ file. Then I think its time to get the PowerOP patches > in the queue for acceptance. Any comments about this? Well, you'll only get good interface review when you have Documentation/ , and it needs to go to lkml before it goes to any queues. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: community PM requirements/issues and PowerOP [Was: Re: So, what's the status on the recent patches here?] 2006-09-11 19:55 ` Pavel Machek @ 2006-09-11 20:53 ` Eugeny S. Mints 0 siblings, 0 replies; 5+ messages in thread From: Eugeny S. Mints @ 2006-09-11 20:53 UTC (permalink / raw) To: Pavel Machek; +Cc: pm list, Preece Scott-PREECE Pavel Machek wrote: > Hi! > >> I know its confusing having oppoint (from Dave Singleton) and powerop >> being discussed at the same time. However, I believe we (PowerOP) >> have > > Yes, it is. > >> - PowerOP is only one layer (towards the bottom) in a power management >> solution. >> - PowerOP does *not* replace cpufreq > > PowerOP provides userland interface for changing processor > frequency. That's bad -- duplicate interface. Basically the biggest problem with cpufreq interface is that cpufreq has "chose predefined closest to a given frequency" functionality implemented in the kernel while there is _no_ any reason to have this functionality implemented in the kernel if we have sysfs interface exported by PowerOP in place - you just _have_ to keep all possible functionality out of the kernel. CPufreq interface is just subset of sysfs interface provided by PowerOP and _must_ be implemented in userspace on top of sysfs interface - this is the proper way to scape duplication. Such issue with cpufreq<->kernel userspace interface is consequence of the fact that cpufreq implements incorrect design of PM stack layers and interfaces. PowerOP solves this issues as well. > >> - The PowerOP interface was discussed in detail on this list and we >> haven't heard any negative comments. > > Eh? Was I on different list?vb dfgdfv > >> - We are not advocating the integration with sleep states. We want to >> get the PowerOP interface accepted and then we can build on it. > > Good. > >> We have a few more comments from Greg to take care of and we can add a >> Documentation/ file. Then I think its time to get the PowerOP patches >> in the queue for acceptance. Any comments about this? > > Well, you'll only get good interface review when you have > Documentation/ , and it needs to go to lkml before it goes to any > queues. PM stack is too complex and heavy part to go in such pieces thru lkml. i expect all linux pm experts to be on this list Eugeny > Pavel > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: community PM requirements/issues and PowerOP [Was: Re: So, what's the status on the recent patches here?] 2006-09-11 20:53 ` Eugeny S. Mints (?) @ 2006-09-11 21:00 ` Pavel Machek 2006-09-11 22:05 ` Eugeny S. Mints -1 siblings, 1 reply; 5+ messages in thread From: Pavel Machek @ 2006-09-11 21:00 UTC (permalink / raw) To: Eugeny S. Mints; +Cc: pm list, Preece Scott-PREECE > >>- PowerOP is only one layer (towards the bottom) in a power management > >>solution. > >>- PowerOP does *not* replace cpufreq > > > >PowerOP provides userland interface for changing processor > >frequency. That's bad -- duplicate interface. > Basically the biggest problem with cpufreq interface is that cpufreq has > "chose > predefined closest to a given frequency" functionality implemented in the > kernel while there is _no_ any reason to have this functionality > implemented in > the kernel if we have sysfs interface exported by PowerOP in place - you > just No, there is reason to keep that in kernel -- so that cpufreq userspace interface can be kept, and so that resulting kernel<->user interface is not ugly. > of the fact that cpufreq implements incorrect design of PM stack layers and > interfaces. PowerOP solves this issues as well. Actually I believe that cpufreq implements correct design and PowerOP got it wrong. > >Well, you'll only get good interface review when you have > >Documentation/ , and it needs to go to lkml before it goes to any > >queues. > PM stack is too complex and heavy part to go in such pieces thru lkml. i > expect all linux pm experts to be on this list "We are too lazy to make the code clean enough / well enough explained to survive lkml review". Your interface impacts everyone, so everyone should have chance to comment on it. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: community PM requirements/issues and PowerOP [Was: Re: So, what's the status on the recent patches here?] 2006-09-11 21:00 ` Pavel Machek @ 2006-09-11 22:05 ` Eugeny S. Mints 2006-09-11 22:56 ` cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP] Pavel Machek 0 siblings, 1 reply; 5+ messages in thread From: Eugeny S. Mints @ 2006-09-11 22:05 UTC (permalink / raw) To: Pavel Machek; +Cc: pm list, Preece Scott-PREECE Pavel Machek wrote: >>>> - PowerOP is only one layer (towards the bottom) in a power management >>>> solution. >>>> - PowerOP does *not* replace cpufreq >>> PowerOP provides userland interface for changing processor >>> frequency. That's bad -- duplicate interface. >> Basically the biggest problem with cpufreq interface is that cpufreq has >> "chose >> predefined closest to a given frequency" functionality implemented in the >> kernel while there is _no_ any reason to have this functionality >> implemented in >> the kernel if we have sysfs interface exported by PowerOP in place - you >> just > > No, there is reason to keep that in kernel -- so that cpufreq > userspace interface can be kept, and so that resulting kernel<->user > interface is not ugly. Cpuferq defines cpufreq_frequency_table structure in arch independent header while it's arch dependent data structure. A lot of code is built around this invalid basic brick and therefore is invalid: cpufreq tables, interface with cpu freq drivers, etc. Notion of transition latency as it defined by cpufreq is wrong because it's not a function of cpu type but function of current and next operating point. no runtime control on operating points set. it's always the same set of operating points for all system cpus in smp case and no way to define different sets or track any dependencies in case say multi core cpus. insufficient kernel<->user space interface to handle embedded requirements and no way to extend it within current design. Shall I continue? Why should then anyone want to keep cpufreq userspace interface just due to keep it? > >> of the fact that cpufreq implements incorrect design of PM stack layers and >> interfaces. PowerOP solves this issues as well. > > Actually I believe that cpufreq implements correct design and PowerOP > got it wrong. PowerOP/cufreq integration patch set contains very details explanation of the cpufreq design issues and POwerOP solutions. Comment there is you feel it's wrong. Eugeny > >>> Well, you'll only get good interface review when you have >>> Documentation/ , and it needs to go to lkml before it goes to any >>> queues. >> PM stack is too complex and heavy part to go in such pieces thru lkml. i >> expect all linux pm experts to be on this list > > "We are too lazy to make the code clean enough / well enough explained > to survive lkml review". > > Your interface impacts everyone, so everyone should have chance to > comment on it. > Pavel ^ permalink raw reply [flat|nested] 5+ messages in thread
* cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP] 2006-09-11 22:05 ` Eugeny S. Mints @ 2006-09-11 22:56 ` Pavel Machek 2006-09-12 0:17 ` Mark Gross 0 siblings, 1 reply; 5+ messages in thread From: Pavel Machek @ 2006-09-11 22:56 UTC (permalink / raw) To: kernel list, Eugeny S. Mints; +Cc: pm list, Preece Scott-PREECE Hi! Just for the record... this goes out to the lkml. This discussion was internal for way too long. (for interested lkml readers, I'm sure linux-pm mailing list has public archive somewhere). On Tue 2006-09-12 02:05:26, Eugeny S. Mints wrote: > Pavel Machek wrote: > >>>>- PowerOP is only one layer (towards the bottom) in a power management > >>>>solution. > >>>>- PowerOP does *not* replace cpufreq > >>>PowerOP provides userland interface for changing processor > >>>frequency. That's bad -- duplicate interface. > >>Basically the biggest problem with cpufreq interface is that cpufreq has > >>"chose > >>predefined closest to a given frequency" functionality implemented in the > >>kernel while there is _no_ any reason to have this functionality > >>implemented in > >>the kernel if we have sysfs interface exported by PowerOP in place - you > >>just > > > >No, there is reason to keep that in kernel -- so that cpufreq > >userspace interface can be kept, and so that resulting kernel<->user > >interface is not ugly. > Cpuferq defines cpufreq_frequency_table structure in arch independent > header while it's arch dependent data structure. A lot of code is built > around this invalid basic brick and therefore is invalid: cpufreq tables, > interface with cpu freq drivers, etc. Notion of transition latency as it > defined by cpufreq is wrong because it's not a function of cpu type but > function of current and next operating point. no runtime control on > operating points set. it's always the same set of operating points for all > system cpus in smp case and no way to define different sets or track any > dependencies in case say multi core cpus. insufficient kernel<->user space > interface to handle embedded requirements and no way to extend it within > current design. Shall I continue? Why should then anyone want to keep > cpufreq userspace interface just due to keep it? Yes, please continue. I do not think we can just rip cpufreq interface out of kernel, and I do not think it is as broken as you claim it is. Ripping interface out of kernel takes years. I'm sure cpufreq_frequency_table could be moved to asm/ header if you felt strongly about that. I believe we need to fix cpufreq if it is broken for embedded cases. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP] 2006-09-11 22:56 ` cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP] Pavel Machek @ 2006-09-12 0:17 ` Mark Gross 2006-09-12 3:37 ` Greg KH 0 siblings, 1 reply; 5+ messages in thread From: Mark Gross @ 2006-09-12 0:17 UTC (permalink / raw) To: Pavel Machek Cc: kernel list, Eugeny S. Mints, Matthew Locke, Greg KH, Amit Kucheria, pm list, Preece Scott-PREECE, Igor Stoppa On Tue, Sep 12, 2006 at 12:56:17AM +0200, Pavel Machek wrote: > Hi! > > Just for the record... this goes out to the lkml. This discussion was > internal for way too long. (for interested lkml readers, I'm sure > linux-pm mailing list has public archive somewhere). > This was rude. > On Tue 2006-09-12 02:05:26, Eugeny S. Mints wrote: > > Pavel Machek wrote: > > >>>>- PowerOP is only one layer (towards the bottom) in a power management > > >>>>solution. > > >>>>- PowerOP does *not* replace cpufreq > > >>>PowerOP provides userland interface for changing processor > > >>>frequency. That's bad -- duplicate interface. > > >>Basically the biggest problem with cpufreq interface is that cpufreq has > > >>"chose > > >>predefined closest to a given frequency" functionality implemented in the > > >>kernel while there is _no_ any reason to have this functionality > > >>implemented in > > >>the kernel if we have sysfs interface exported by PowerOP in place - you > > >>just > > > > > >No, there is reason to keep that in kernel -- so that cpufreq > > >userspace interface can be kept, and so that resulting kernel<->user > > >interface is not ugly. > > Cpuferq defines cpufreq_frequency_table structure in arch independent > > header while it's arch dependent data structure. A lot of code is built > > around this invalid basic brick and therefore is invalid: cpufreq tables, > > interface with cpu freq drivers, etc. Notion of transition latency as it > > defined by cpufreq is wrong because it's not a function of cpu type but > > function of current and next operating point. no runtime control on > > operating points set. it's always the same set of operating points for all > > system cpus in smp case and no way to define different sets or track any > > dependencies in case say multi core cpus. insufficient kernel<->user space > > interface to handle embedded requirements and no way to extend it within > > current design. Shall I continue? Why should then anyone want to keep > > cpufreq userspace interface just due to keep it? > > Yes, please continue. I do not think we can just rip cpufreq interface > out of kernel, and I do not think it is as broken as you claim it > is. Ripping interface out of kernel takes years. > > I'm sure cpufreq_frequency_table could be moved to asm/ header if you > felt strongly about that. > > I believe we need to fix cpufreq if it is broken for embedded > cases. cpufreq is broken at the cpufreq_driver interface for embedded applications needing control over more than one control variable at a time. That interface only supports setting target frequencies, and expanding it to set target frequencies and voltages is not possible without something like PowerOP. Adding the types of parameters to cpufreq would likely make cpufreq a mess. I think we would be better off with something that coexists with cpufreq, like the powerop patch from Eugeny. God help you if you try to use cpufreq on a complex non-PC platform with multiple power and clock domains that need to be tweaked to squeeze out competitive battery life. Because of the existing user base of cpufreq removing cpufreq will never happen. No one supporting the PowerOP patch has never recommended such a thing. However; holding back innovation because of an existing solution that doesn't support a large class of users seems dumb. --mgross ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP] 2006-09-12 0:17 ` Mark Gross @ 2006-09-12 3:37 ` Greg KH 2006-09-13 23:50 ` [linux-pm] " David Singleton 0 siblings, 1 reply; 5+ messages in thread From: Greg KH @ 2006-09-12 3:37 UTC (permalink / raw) To: Mark Gross Cc: Pavel Machek, kernel list, Eugeny S. Mints, Matthew Locke, Amit Kucheria, pm list, Preece Scott-PREECE, Igor Stoppa On Mon, Sep 11, 2006 at 05:17:01PM -0700, Mark Gross wrote: > > cpufreq is broken at the cpufreq_driver interface for embedded > applications needing control over more than one control variable at a > time. > > That interface only supports setting target frequencies, and expanding it > to set target frequencies and voltages is not possible without something > like PowerOP. Adding the types of parameters to cpufreq would likely > make cpufreq a mess. I think we would be better off with something that > coexists with cpufreq, like the powerop patch from Eugeny. > > God help you if you try to use cpufreq on a complex non-PC platform with > multiple power and clock domains that need to be tweaked to squeeze out > competitive battery life. > > Because of the existing user base of cpufreq removing cpufreq will never > happen. No one supporting the PowerOP patch has never recommended > such a thing. However; holding back innovation because of an existing > solution that doesn't support a large class of users seems dumb. But you can't break the existing stuff, and it seems that some of these proposals are doing just that. :( thanks, greg k-h ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [linux-pm] cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP] 2006-09-12 3:37 ` Greg KH @ 2006-09-13 23:50 ` David Singleton 2006-09-14 5:55 ` OpPoint summary Greg KH 0 siblings, 1 reply; 5+ messages in thread From: David Singleton @ 2006-09-13 23:50 UTC (permalink / raw) To: Greg KH; +Cc: linux-pm, kernel list Greg, here's a few paragraphs about the power management code I'm working on. The OpPoint patch set is a fully functionaly power management solution, from kernel operating state support to userland power manager. OpPoint constructs operating points for all supported frequency, voltage and suspend states for PC and SoC solutions running Linux. OpPoint does not break or replace cpufreq. It leverages cpufreq code to provide the same driver scaling functions when cpu frequency changes affect drivers. (The ARM pxa27x patch uses the cpufreq scaling routines to scale the LCD when frequencies are changed and works well when playing mpeg movies on the LCD during frequency scaling operations). The Operating Points in OpPoint are simply created at compile time, in the same manner cpufreq tables are, and registered in /sys/power/operating_states directory when the cpu is identified at boot time. The states are ordered by name on their power consumption level from lowest to highest so the power manager can operate correctly regardless of what frequencies or voltages are associated with the lowest or highest operating points. There is no kernel interface to parse and create all the parameters needed to create an operating point. Platform specific information is supplied to the oppoint structure through an ops vector of three routines and a void * pointer to supply the platform specific data, in the same manner drivers have a void * for their private data. The ops vectors provide operating point specific functions to prepare to change to a new operating point, transition to the target operating point, and a finish transition routine to either notify driver that the clocks have scaled and operation of bus and DMA traffic may continue. OpPoint draws the line about what's needed in the kernel a bit differently than Matt's PowerOp code. OpPoint only puts operating point support in the kernel. Polices for operting states and classes of operating states are left to the power manager, in userland. This simplifies the kernel code, no string parsers for operating point parameter construction, and makes it easier to customize a solution by customizing the power manager. A power manager is supplied with the OpPoint patch set as well. I borrowed the cpuspeed deamon and made a simple patch that uses the new OpPoint sysfs interface. The daemon can be compiled as the original cpuspeed or oppointd deamon depending on the users choice. The daemon provides the same functions as the cpuspeed daemon. OpPoint is a fully functional solution ready for testing and evaluation in Andrew's or your tree. The kernel patches are available at: http://source.mvista.com/~dsingleton/2.6.1-rc6 the power manager source code and patch is available at: http://source.mvista.com/~dsingleton/oppointd David ^ permalink raw reply [flat|nested] 5+ messages in thread
* OpPoint summary 2006-09-13 23:50 ` [linux-pm] " David Singleton @ 2006-09-14 5:55 ` Greg KH 2006-09-14 17:07 ` David Singleton 2006-09-17 5:07 ` David Singleton 0 siblings, 2 replies; 5+ messages in thread From: Greg KH @ 2006-09-14 5:55 UTC (permalink / raw) To: David Singleton; +Cc: linux-pm, kernel list On Wed, Sep 13, 2006 at 04:50:43PM -0700, David Singleton wrote: > Greg, > here's a few paragraphs about the power management code I'm working on. > The OpPoint patch set is a fully functionaly power management solution, > from kernel operating state support to userland power manager. Thanks for the summary, but it was a bit longer than just "one paragraph" :) > OpPoint constructs operating points for all supported frequency, voltage > and suspend states for PC and SoC solutions running Linux. OpPoint > does not break or replace cpufreq. It leverages cpufreq code to provide > the same driver scaling functions when cpu frequency changes affect drivers. So it works with cpufreq? That's a good thing, as it is a requirement. We can't just break current usages. > OpPoint is a fully functional solution ready for testing and evaluation > in Andrew's or your tree. > > The kernel patches are available at: > > http://source.mvista.com/~dsingleton/2.6.1-rc6 I get a 404 with that link :( Care to resend your patches in the proper format, through email so that we can see them, and possibly get some testing in -mm if they look sane? thanks, greg k-h ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: OpPoint summary 2006-09-14 5:55 ` OpPoint summary Greg KH @ 2006-09-14 17:07 ` David Singleton 2006-09-14 17:25 ` Auke Kok 2006-09-17 17:48 ` Pavel Machek 2006-09-17 5:07 ` David Singleton 1 sibling, 2 replies; 5+ messages in thread From: David Singleton @ 2006-09-14 17:07 UTC (permalink / raw) To: Greg KH; +Cc: linux-pm, kernel list > > Care to resend your patches in the proper format, through email so that > we can see them, and possibly get some testing in -mm if they look sane? Greg, here's the patch that implements operating points for different frequencies for the speedstep-centrino line of processors. Operating points are created in much the same manner that cpufreq tables are. This works for both simple implementations like the centrino and more complex SoC systems like the arm-pxa72x which has several clocks to control, and different clock divisors and multipliers. David Signed-Off-by: David Singleton <dsingleton@mvista.com> arch/i386/Kconfig | 2 arch/i386/kernel/cpu/Makefile | 1 arch/i386/kernel/cpu/power/Kconfig | 168 ++++++++++ arch/i386/kernel/cpu/power/Makefile | 2 arch/i386/kernel/cpu/power/centrino-on-the-fly.c | 72 ++++ arch/i386/kernel/cpu/power/centrino-speedstep.c | 368 +++++++++++++++++++++++ arch/i386/kernel/i386_ksyms.c | 4 7 files changed, 617 insertions(+) Index: linux-2.6.17/arch/i386/kernel/cpu/Makefile =================================================================== --- linux-2.6.17.orig/arch/i386/kernel/cpu/Makefile +++ linux-2.6.17/arch/i386/kernel/cpu/Makefile @@ -17,3 +17,4 @@ obj-$(CONFIG_X86_MCE) += mcheck/ obj-$(CONFIG_MTRR) += mtrr/ obj-$(CONFIG_CPU_FREQ) += cpufreq/ +obj-$(CONFIG_PM) += power/ Index: linux-2.6.17/arch/i386/kernel/i386_ksyms.c =================================================================== --- linux-2.6.17.orig/arch/i386/kernel/i386_ksyms.c +++ linux-2.6.17/arch/i386/kernel/i386_ksyms.c @@ -28,3 +28,7 @@ EXPORT_SYMBOL(__read_lock_failed); #endif EXPORT_SYMBOL(csum_partial); +#ifdef CONFIG_PM +#include <linux/pm.h> +EXPORT_SYMBOL(pm_states); +#endif Index: linux-2.6.17/arch/i386/Kconfig =================================================================== --- linux-2.6.17.orig/arch/i386/Kconfig +++ linux-2.6.17/arch/i386/Kconfig @@ -964,6 +964,8 @@ config APM_REAL_MODE_POWER_OFF endmenu +source "arch/i386/kernel/cpu/power/Kconfig" + source "arch/i386/kernel/cpu/cpufreq/Kconfig" endmenu Index: linux-2.6.17/arch/i386/kernel/cpu/power/Makefile =================================================================== --- /dev/null +++ linux-2.6.17/arch/i386/kernel/cpu/power/Makefile @@ -0,0 +1,2 @@ +obj-m += centrino-on-the-fly.o +obj-$(CONFIG_X86_SPEEDSTEP_CENTRINO) += centrino-speedstep.o Index: linux-2.6.17/arch/i386/kernel/cpu/power/centrino-speedstep.c =================================================================== --- /dev/null +++ linux-2.6.17/arch/i386/kernel/cpu/power/centrino-speedstep.c @@ -0,0 +1,368 @@ +/* + * OpPoint support for Enhanced SpeedStep, as found in Intel's Pentium + * M (part of the Centrino chipset). + * + * Modelled on speedstep-centrino.c + * + * Author: David Singleton dsingleton@mvista.com MontaVista Software, Inc. + * + * 2006 (c) MontaVista Software, Inc. This file is licensed under + * the terms of the GNU General Public License version 2. This program + * is licensed "as is" without any warranty of any kind, whether express + * or implied. + */ + +#include <linux/module.h> +#include <linux/kernel.h> +#include <linux/init.h> +#include <linux/pm.h> +#include <linux/delay.h> +#include <linux/cpufreq.h> +#include <linux/moduleparam.h> +#include <linux/moduleloader.h> + +struct cpu_id +{ + __u8 x86; /* CPU family */ + __u8 x86_model; /* model */ + __u8 x86_mask; /* stepping */ +}; + +enum { + CPU_BANIAS, + CPU_DOTHAN_A1, + CPU_DOTHAN_A2, + CPU_DOTHAN_B0, + CPU_MP4HT_D0, + CPU_MP4HT_E0, +}; + +static const struct cpu_id cpu_ids[] = { + [CPU_BANIAS] = { 6, 9, 5 }, + [CPU_DOTHAN_A1] = { 6, 13, 1 }, + [CPU_DOTHAN_A2] = { 6, 13, 2 }, + [CPU_DOTHAN_B0] = { 6, 13, 6 }, + [CPU_MP4HT_D0] = {15, 3, 4 }, + [CPU_MP4HT_E0] = {15, 4, 1 }, +}; +#define N_IDS ARRAY_SIZE(cpu_ids) + +struct cpu_model +{ + const struct cpu_id *cpu_id; + const char *model_name; + unsigned max_freq; /* max clock in kHz */ + +}; +static int centrino_verify_cpu_id(const struct cpuinfo_x86 *c, const struct cpu_id *x); + +void centrino_set_frequency(struct oppoint *op, uint freq, uint volt) +{ + op->frequency = freq * 1000; + op->voltage = volt; + op->md_data = (void *)(((freq / 100) << 8) | (volt - 700) / 16); +} +EXPORT_SYMBOL(centrino_set_frequency); + +int centrino_transition(struct oppoint *cur, struct oppoint *new) +{ + unsigned int msr, oldmsr = 0, h = 0; + + if (cur == new) + return 0; + + msr = (unsigned int)new->md_data; + rdmsr(MSR_IA32_PERF_CTL, oldmsr, h); + + /* all but 16 LSB are reserved, treat them with care */ + oldmsr &= ~0xffff; + msr &= 0xffff; + oldmsr |= msr; + + wrmsr(MSR_IA32_PERF_CTL, oldmsr, h); + + udelay(new->latency); + + return 0; +} +EXPORT_SYMBOL(centrino_transition); + +#define _BANIAS(cpuid, max, name) \ +{ .cpu_id = cpuid, \ + .model_name = "Intel(R) Pentium(R) M processor " name "MHz", \ + .max_freq = (max)*1000, \ +} +#define BANIAS(max) _BANIAS(&cpu_ids[CPU_BANIAS], max, #max) + +/* + * CPU models, their operating frequency range, and freq/voltage + * operating points + */ +static struct cpu_model models[] = +{ + _BANIAS(&cpu_ids[CPU_BANIAS], 900, " 900"), + BANIAS(1000), + BANIAS(1100), + BANIAS(1200), + BANIAS(1300), + BANIAS(1400), + BANIAS(1500), + BANIAS(1600), + BANIAS(1700), + + /* NULL model_name is a wildcard */ + { &cpu_ids[CPU_DOTHAN_A1], NULL, 0}, + { &cpu_ids[CPU_DOTHAN_A2], NULL, 0}, + { &cpu_ids[CPU_DOTHAN_B0], NULL, 0}, + { &cpu_ids[CPU_MP4HT_D0], NULL, 0}, + { &cpu_ids[CPU_MP4HT_E0], NULL, 0}, + + { NULL, } +}; +#undef _BANIAS +#undef BANIAS + +static struct oppoint lowest = { + .name = "lowest", + .type = PM_FREQ_CHANGE, + .frequency = 0, + .voltage = 0, + .latency = 15, + .prepare_transition = cpufreq_prepare_transition, + .transition = centrino_transition, + .finish_transition = cpufreq_finish_transition, +}; + +static struct oppoint low = { + .name = "low", + .type = PM_FREQ_CHANGE, + .latency = 15, + .prepare_transition = cpufreq_prepare_transition, + .transition = centrino_transition, + .finish_transition = cpufreq_finish_transition, +}; + +static struct oppoint mediumlow = { + .name = "mediumlow", + .type = PM_FREQ_CHANGE, + .latency = 15, + .prepare_transition = cpufreq_prepare_transition, + .transition = centrino_transition, + .finish_transition = cpufreq_finish_transition, +}; + +static struct oppoint medium = { + .name = "medium", + .type = PM_FREQ_CHANGE, + .latency = 15, + .prepare_transition = cpufreq_prepare_transition, + .transition = centrino_transition, + .finish_transition = cpufreq_finish_transition, +}; + +static struct oppoint mediumhigh = { + .name = "mediumhigh", + .type = PM_FREQ_CHANGE, + .latency = 15, + .prepare_transition = cpufreq_prepare_transition, + .transition = centrino_transition, + .finish_transition = cpufreq_finish_transition, +}; + +static struct oppoint high = { + .name = "high", + .type = PM_FREQ_CHANGE, + .latency = 15, + .prepare_transition = cpufreq_prepare_transition, + .transition = centrino_transition, + .finish_transition = cpufreq_finish_transition, +}; + +static struct oppoint highest = { + .name = "highest", + .type = PM_FREQ_CHANGE, + .latency = 15, + .prepare_transition = cpufreq_prepare_transition, + .transition = centrino_transition, + .finish_transition = cpufreq_finish_transition, +}; + +static int __init centrino_init_oppoint(void) +{ + struct cpuinfo_x86 *cpu = &cpu_data[0]; + struct cpu_model *model; + + for(model = models; model->cpu_id != NULL; model++) { + if (centrino_verify_cpu_id(cpu, model->cpu_id) && + (model->model_name == NULL || + strcmp(cpu->x86_model_id, model->model_name) == 0)) + break; + } + + if (model->cpu_id == NULL) { + /* No match at all */ + printk("OpPoint: no support for CPU model %s\n", + cpu->x86_model_id); + return -ENOENT; + } + + switch (model->max_freq) { + /* Ultra Low Voltage Intel Pentium M processor 900MHz (Banias) */ + case (900000) : + { + centrino_set_frequency(&low, 600, 844); + centrino_set_frequency(&medium, 800, 988); + centrino_set_frequency(&high, 900, 1004); + break; + } + /* Ultra Low Voltage Intel Pentium M processor 1.00GHz (Banias) */ + case (1000000) : + { + centrino_set_frequency(&low, 600, 844); + centrino_set_frequency(&medium, 800, 972); + centrino_set_frequency(&high, 900, 988); + centrino_set_frequency(&highest, 1000, 1004); + break; + } + /* Ultra Low Voltage Intel Pentium M processor 1.10GHz (Banias) */ + case (1100000) : + { + centrino_set_frequency(&lowest, 600, 956); + centrino_set_frequency(&low, 800, 1020); + centrino_set_frequency(&medium, 900, 1100); + centrino_set_frequency(&high, 1000, 1164); + centrino_set_frequency(&highest, 1100, 1180); + break; + } + /* Ultra Low Voltage Intel Pentium M processor 1.10GHz (Banias) */ + case (1200000) : + { + centrino_set_frequency(&lowest, 600, 956); + centrino_set_frequency(&low, 800, 1004); + centrino_set_frequency(&medium, 900, 1020); + centrino_set_frequency(&mediumhigh, 1000, 1100); + centrino_set_frequency(&high, 1100, 1164); + centrino_set_frequency(&highest, 1200, 1180); + break; + } + /* Ultra Low Voltage Intel Pentium M processor 1.10GHz (Banias) */ + case (1300000) : + { + centrino_set_frequency(&lowest, 600, 956); + centrino_set_frequency(&low, 800, 1260); + centrino_set_frequency(&medium, 1000, 1292); + centrino_set_frequency(&high, 1200, 1356); + centrino_set_frequency(&highest, 1300, 1388); + break; + } + /* Ultra Low Voltage Intel Pentium M processor 1.10GHz (Banias) */ + case (1400000) : + { + centrino_set_frequency(&lowest, 600, 956); + centrino_set_frequency(&low, 800, 1180); + centrino_set_frequency(&medium, 1000, 1308); + centrino_set_frequency(&high, 1200, 1436); + centrino_set_frequency(&highest, 1400, 1484); + break; + } + /* Ultra Low Voltage Intel Pentium M processor 1.10GHz (Banias) */ + case (1500000) : + { + centrino_set_frequency(&lowest, 600, 956); + centrino_set_frequency(&low, 800, 1116); + centrino_set_frequency(&medium, 1000, 1228); + centrino_set_frequency(&mediumhigh, 1200, 1356); + centrino_set_frequency(&high, 1400, 1452); + centrino_set_frequency(&highest, 1500, 1484); + break; + } + /* Ultra Low Voltage Intel Pentium M processor 1.10GHz (Banias) */ + case (1600000) : + { + centrino_set_frequency(&lowest, 600, 956); + centrino_set_frequency(&low, 800, 1036); + centrino_set_frequency(&medium, 1000, 1164); + centrino_set_frequency(&mediumhigh, 1200, 1276); + centrino_set_frequency(&high, 1400, 1420); + centrino_set_frequency(&highest, 1600, 1484); + break; + } + /* Ultra Low Voltage Intel Pentium M processor 1.10GHz (Banias) */ + case (1700000) : + { + centrino_set_frequency(&lowest, 600, 956); + centrino_set_frequency(&low, 800, 1004); + centrino_set_frequency(&medium, 1000, 1116); + centrino_set_frequency(&mediumhigh, 1200, 1228); + centrino_set_frequency(&high, 1400, 1308); + centrino_set_frequency(&highest, 1700, 1484); + break; + } + } + if (lowest.frequency) { + register_operating_point(&lowest); + list_add_tail(&lowest.list, &pm_states.list); + } + if (low.frequency) { + register_operating_point(&low); + list_add_tail(&low.list, &pm_states.list); + } + if (mediumlow.frequency) { + register_operating_point(&mediumlow); + list_add_tail(&mediumlow.list, &pm_states.list); + } + if (medium.frequency) { + register_operating_point(&medium); + list_add_tail(&medium.list, &pm_states.list); + } + if (mediumhigh.frequency) { + register_operating_point(&mediumhigh); + list_add_tail(&mediumhigh.list, &pm_states.list); + } + if (high.frequency) { + register_operating_point(&high); + list_add_tail(&high.list, &pm_states.list); + current_state = &high; + } + if (highest.frequency) { + register_operating_point(&highest); + list_add_tail(&highest.list, &pm_states.list); + current_state = &highest; + } + return 0; +} + +static void centrino_exit_oppoint(void) +{ + if (lowest.frequency) + list_del_init(&lowest.list); + if (low.frequency) + list_del_init(&low.list); + if (mediumlow.frequency) + list_del_init(&mediumlow.list); + if (medium.frequency) + list_del_init(&medium.list); + if (mediumhigh.frequency) + list_del_init(&mediumhigh.list); + if (high.frequency) + list_del_init(&high.list); + if (highest.frequency) + list_del_init(&highest.list); + return; +} + +static int centrino_verify_cpu_id(const struct cpuinfo_x86 *c, const struct cpu_id *x) +{ + if ((c->x86 == x->x86) && + (c->x86_model == x->x86_model) && + (c->x86_mask == x->x86_mask)) + return 1; + return 0; +} + +MODULE_AUTHOR ("David Singleton <dsingleton@mvista.com>"); +MODULE_DESCRIPTION ("OpPoint operting points for Intel Pentium M processors."); +MODULE_LICENSE ("GPL"); + +late_initcall(centrino_init_oppoint); +module_exit(centrino_exit_oppoint); Index: linux-2.6.17/arch/i386/kernel/cpu/power/Kconfig =================================================================== --- /dev/null +++ linux-2.6.17/arch/i386/kernel/cpu/power/Kconfig @@ -0,0 +1,168 @@ +# +# Operating Point support for frequency/voltage scaling +# + +menu "CPU Frequency/Voltage scaling" + +if CPU_PM + +comment "OpPoint processor support" + +config ELAN_OPPOINT + tristate "AMD Elan SC400 and SC410" + depends on X86_ELAN + ---help--- + This adds the OpPoint support for AMD Elan SC400 and SC410 + processors. + + You need to specify the processor maximum speed as boot + parameter: elanfreq=maxspeed (in kHz) or as module + parameter "max_freq". + + If in doubt, say N. + +config SC520_OPPOINT + tristate "AMD Elan SC520" + depends on X86_ELAN + ---help--- + This adds OpPoint support for AMD Elan SC520 processor. + + If in doubt, say N. + + +config X86_POWERNOW_K6 + tristate "AMD Mobile K6-2/K6-3 PowerNow!" + help + This adds OpPoint support for mobile AMD K6-2+ and mobile + AMD K6-3+ processors. + + If in doubt, say N. + +config X86_POWERNOW_K7 + tristate "AMD Mobile Athlon/Duron PowerNow!" + help + This adds OpPoint support for mobile AMD K7 mobile processors. + + If in doubt, say N. + +config X86_POWERNOW_K7_ACPI + bool + depends on X86_POWERNOW_K7 && ACPI_PROCESSOR + depends on !(X86_POWERNOW_K7 = y && ACPI_PROCESSOR = m) + default y + +config X86_POWERNOW_K8 + tristate "AMD Opteron/Athlon64 PowerNow!" + depends on EXPERIMENTAL + help + This adds OpPoint support for mobile AMD Opteron/Athlon64 processors. + + If in doubt, say N. + +config X86_POWERNOW_K8_ACPI + bool + depends on X86_POWERNOW_K8 && ACPI_PROCESSOR + depends on !(X86_POWERNOW_K8 = y && ACPI_PROCESSOR = m) + default y + +config X86_GX_SUSPMOD + tristate "Cyrix MediaGX/NatSemi Geode Suspend Modulation" + depends on PCI + help + This add OpPoint support for NatSemi Geode processors which + support suspend modulation. + + If in doubt, say N. + +config X86_SPEEDSTEP_CENTRINO + tristate "Intel Enhanced SpeedStep" + select X86_SPEEDSTEP_CENTRINO_TABLE if (!X86_SPEEDSTEP_CENTRINO_ACPI) + help + This adds OpPoint support for Enhanced SpeedStep enabled + mobile CPUs. This means Intel Pentium M (Centrino) CPUs. However, + you also need to say Y to "Use ACPI tables to decode..." below + [which might imply enabling ACPI] if you want to use this driver + on non-Banias CPUs. + + If in doubt, say N. + +config X86_SPEEDSTEP_CENTRINO_ACPI + bool "Use ACPI tables to decode valid frequency/voltage pairs" + depends on X86_SPEEDSTEP_CENTRINO && ACPI_PROCESSOR + depends on !(X86_SPEEDSTEP_CENTRINO = y && ACPI_PROCESSOR = m) + default y + help + Use primarily the information provided in the BIOS ACPI tables + to determine valid CPU frequency and voltage pairings. It is + required for the driver to work on non-Banias CPUs. + + If in doubt, say Y. + +config X86_SPEEDSTEP_CENTRINO_TABLE + bool "Built-in tables for Banias CPUs" + depends on X86_SPEEDSTEP_CENTRINO + default y + help + Use built-in tables for Banias CPUs if ACPI encoding + is not available. + + If in doubt, say N. + +config X86_SPEEDSTEP_ICH + tristate "Intel Speedstep on ICH-M chipsets (ioport interface)" + help + This adds the OpPoint support for certain mobile Intel Pentium III + (Coppermine), all mobile Intel Pentium III-M (Tualatin) and all + mobile Intel Pentium 4 P4-M on systems which have an Intel ICH2, + ICH3 or ICH4 southbridge. + + If in doubt, say N. + +config X86_SPEEDSTEP_SMI + tristate "Intel SpeedStep on 440BX/ZX/MX chipsets (SMI interface)" + depends on EXPERIMENTAL + help + This adds OpPoint support for certain mobile Intel Pentium III + (Coppermine), all mobile Intel Pentium III-M (Tualatin) + on systems which have an Intel 440BX/ZX/MX southbridge. + + If in doubt, say N. + +config X86_P4_CLOCKMOD + tristate "Intel Pentium 4 clock modulation" + help + This adds OpPoint support for Intel Pentium 4 / XEON + processors. + + If in doubt, say N. + +config X86_OPPOINT_NFORCE2 + tristate "nVidia nForce2 FSB changing" + depends on EXPERIMENTAL + help + This adds OpPoint support for FSB changing on nVidia nForce2 + platforms. + + If in doubt, say N. + +config X86_LONGRUN + tristate "Transmeta LongRun" + help + This adds OpPoint support for Transmeta Crusoe and Efficeon processors + which support LongRun. + + If in doubt, say N. + +config X86_LONGHAUL + tristate "VIA Cyrix III Longhaul" + depends on ACPI_PROCESSOR + help + This adds OpPoint support for VIA Samuel/CyrixIII, + VIA Cyrix Samuel/C3, VIA Cyrix Ezra and VIA Cyrix Ezra-T + processors. + + If in doubt, say N. + +endif # CONFIG_PM + +endmenu Index: linux-2.6.17/arch/i386/kernel/cpu/power/centrino-on-the-fly.c =================================================================== --- /dev/null +++ linux-2.6.17/arch/i386/kernel/cpu/power/centrino-on-the-fly.c @@ -0,0 +1,72 @@ +/* + * power/centrino-on-the-fly.c + * + * This is the template to create a dynamic operating point. + * + * Author: David Singleton dsingleton@mvista.com MontaVista Software, Inc. + * + * 2006 (c) MontaVista Software, Inc. This file is licensed under + * the terms of the GNU General Public License version 2. This program + * is licensed "as is" without any warranty of any kind, whether express + * or implied. + */ + +#include <linux/module.h> +#include <linux/init.h> +#include <linux/list.h> +#include <linux/pm.h> +#include <linux/cpufreq.h> +#include <linux/moduleparam.h> +#include <linux/moduleloader.h> + +int centrino_transition(struct oppoint *cur, struct oppoint *new); + +static unsigned int frequency = 1000; +static unsigned int voltage = 1308; +static unsigned int latency = 100; +module_param_named(frequency, frequency, uint, 0); +module_param_named(voltage, voltage, uint, 0); +module_param_named(latency, latency, uint, 0); +MODULE_PARM_DESC(frequency, "cpu frequency in kHz"); +MODULE_PARM_DESC(voltage, "cpu voltage in mV"); +MODULE_PARM_DESC(latency, "transition latency in us"); + +/* Register both the driver and the device */ + +static struct oppoint dynamic_op = { + .type = PM_FREQ_CHANGE, + .name = "dynamic", + .prepare_transition = cpufreq_prepare_transition, + .transition = centrino_transition, + .finish_transition = cpufreq_finish_transition, +}; + +extern void centrino_set_frequency(struct oppoint *op, uint freq, uint volt); + +int __init dynamic_oppoint_init(void) +{ + + printk("Dynamic OpPoint operating point for speedstep centrino\n"); + dynamic_op.frequency = frequency; + dynamic_op.voltage = voltage; + dynamic_op.latency = latency; + centrino_set_frequency(&dynamic_op, frequency / 1000, voltage); + register_operating_point(&dynamic_op); + printk("OpPoint: freq %d volt %d msr 0x%x\n", dynamic_op.frequency, + dynamic_op.voltage, (unsigned int)dynamic_op.md_data); + list_add_tail(&dynamic_op.list, &pm_states.list); + return 0; +} + +void __exit dynamic_oppoint_cleanup(void) +{ + unregister_operating_point(&dynamic_op); + list_del_init(&dynamic_op.list); +} + +module_init(dynamic_oppoint_init); +module_exit(dynamic_oppoint_cleanup); + +MODULE_AUTHOR("David Singleton <dsingleton@mvista.com>"); +MODULE_DESCRIPTION("Dynamic OpPoint for Intel Pentium M processor module"); +MODULE_LICENSE("GPL"); > > thanks, > > greg k-h > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: OpPoint summary 2006-09-14 17:07 ` David Singleton @ 2006-09-14 17:25 ` Auke Kok 2006-09-14 18:15 ` [linux-pm] " Vitaly Wool 2006-09-17 17:48 ` Pavel Machek 1 sibling, 1 reply; 5+ messages in thread From: Auke Kok @ 2006-09-14 17:25 UTC (permalink / raw) To: David Singleton; +Cc: linux-pm, kernel list David Singleton wrote: > +static const struct cpu_id cpu_ids[] = { > + [CPU_BANIAS] = { 6, 9, 5 }, > + [CPU_DOTHAN_A1] = { 6, 13, 1 }, > + [CPU_DOTHAN_A2] = { 6, 13, 2 }, > + [CPU_DOTHAN_B0] = { 6, 13, 6 }, > + [CPU_MP4HT_D0] = {15, 3, 4 }, > + [CPU_MP4HT_E0] = {15, 4, 1 }, > +}; Any reason why { 6, 13, 8 } is missing? My lenovo T43 identifies itself as such: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 13 model name : Intel(R) Pentium(R) M processor 1.86GHz stepping : 8 I'm not sure a Dothan B1 exists, but some postings suggest even C0 and C1 are valid steppings. I'm sure OpPoint could work with those as well. Auke ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [linux-pm] OpPoint summary 2006-09-14 17:25 ` Auke Kok @ 2006-09-14 18:15 ` Vitaly Wool 0 siblings, 0 replies; 5+ messages in thread From: Vitaly Wool @ 2006-09-14 18:15 UTC (permalink / raw) To: Auke Kok; +Cc: David Singleton, linux-pm, kernel list On 9/14/06, Auke Kok <auke-jan.h.kok@intel.com> wrote: > David Singleton wrote: > > > +static const struct cpu_id cpu_ids[] = { > > + [CPU_BANIAS] = { 6, 9, 5 }, > > + [CPU_DOTHAN_A1] = { 6, 13, 1 }, > > + [CPU_DOTHAN_A2] = { 6, 13, 2 }, > > + [CPU_DOTHAN_B0] = { 6, 13, 6 }, > > + [CPU_MP4HT_D0] = {15, 3, 4 }, > > + [CPU_MP4HT_E0] = {15, 4, 1 }, > > +}; > > > Any reason why { 6, 13, 8 } is missing? My lenovo T43 identifies itself as such: > > processor : 0 > vendor_id : GenuineIntel > cpu family : 6 > model : 13 > model name : Intel(R) Pentium(R) M processor 1.86GHz > stepping : 8 > > I'm not sure a Dothan B1 exists, but some postings suggest even C0 and C1 are > valid steppings. I'm sure OpPoint could work with those as well. Heh, that shows pretty much that the approach itself is not good... And this is only beginning. Vitaly ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: OpPoint summary 2006-09-14 17:07 ` David Singleton 2006-09-14 17:25 ` Auke Kok @ 2006-09-17 17:48 ` Pavel Machek 2006-09-18 14:33 ` [linux-pm] " Richard A. Griffiths 1 sibling, 1 reply; 5+ messages in thread From: Pavel Machek @ 2006-09-17 17:48 UTC (permalink / raw) To: David Singleton; +Cc: Greg KH, linux-pm, kernel list Hi! > >Care to resend your patches in the proper format, through email so that > >we can see them, and possibly get some testing in -mm if they look sane? > > Greg, > here's the patch that implements operating points for different > frequencies > for the speedstep-centrino line of processors. Operating points are created > in much the same manner that cpufreq tables are. This works for both > simple implementations like the centrino and more complex SoC systems > like the arm-pxa72x which has several clocks to control, and different clock > divisors and multipliers. > +static struct oppoint lowest = { > + .name = "lowest", > + .type = PM_FREQ_CHANGE, > + .frequency = 0, > + .voltage = 0, > + .latency = 15, > + .prepare_transition = cpufreq_prepare_transition, > + .transition = centrino_transition, > + .finish_transition = cpufreq_finish_transition, > +}; We had nice, descriptive interface... with numbers. Now you want to introduce english state names... looks like a step back to me. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [linux-pm] OpPoint summary 2006-09-17 17:48 ` Pavel Machek @ 2006-09-18 14:33 ` Richard A. Griffiths 0 siblings, 0 replies; 5+ messages in thread From: Richard A. Griffiths @ 2006-09-18 14:33 UTC (permalink / raw) To: Pavel Machek; +Cc: David Singleton, linux-pm, kernel list On Sun, 2006-09-17 at 19:48 +0200, Pavel Machek wrote: > Hi! > > > >Care to resend your patches in the proper format, through email so that > > >we can see them, and possibly get some testing in -mm if they look sane? > > > > Greg, > > here's the patch that implements operating points for different > > frequencies > > for the speedstep-centrino line of processors. Operating points are created > > in much the same manner that cpufreq tables are. This works for both > > simple implementations like the centrino and more complex SoC systems > > like the arm-pxa72x which has several clocks to control, and different clock > > divisors and multipliers. > > > +static struct oppoint lowest = { > > + .name = "lowest", > > + .type = PM_FREQ_CHANGE, > > + .frequency = 0, > > + .voltage = 0, > > + .latency = 15, > > + .prepare_transition = cpufreq_prepare_transition, > > + .transition = centrino_transition, > > + .finish_transition = cpufreq_finish_transition, > > +}; > > We had nice, descriptive interface... with numbers. Now you want to > introduce english state names... looks like a step back to me. Maybe a compromise could be reached where a defined set of numbers maps to string names ala Unix init states. Many people (at least me) still invoke init 6 to reboot a system. A defined table would satisfy both the number and string camps. Richard ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: OpPoint summary 2006-09-14 5:55 ` OpPoint summary Greg KH 2006-09-14 17:07 ` David Singleton @ 2006-09-17 5:07 ` David Singleton 2006-09-17 22:43 ` [linux-pm] " Matthew Locke 1 sibling, 1 reply; 5+ messages in thread From: David Singleton @ 2006-09-17 5:07 UTC (permalink / raw) To: Greg KH, pavel; +Cc: linux-pm, kernel list Pavel and Greg, I've incorporated Pavels suggestions and only put suspend states in the /sys/power/state file. The control file for frequency and voltage operating point transitions is now in /sys/power/operating_points/current_point. The /sys/power/operating_points dirctory still contains the operating points themselves, with a frequency, voltage and latency file for each operating point. The oppointd power manager has been changed to use the new control file for operating points. It has been tested on a centrino laptop, the 4 way Xeon server and the arm-pxa27x. I finally got SMP tested on a 4 way Xeon server. The patch that supports SMP Xeon's is the oppoint-x86-p4.patch in the series. The only files in the core framework patch now are: kernel/power/main.c include/linux/pm.h kernel/power/power.h The full patch set is at http://source.mvista.com/~dsingleton/2.6.18-rc7 The power manager source and patch is at: http://source.mvista.com/~dsingleton/oppointd-1.2.3 Attached is the oppoint-core.patch. David Signed-Off-by: David Singleton <dsingleton@mvista.com> include/linux/pm.h | 30 +++- kernel/power/main.c | 361 +++++++++++++++++++++++++++++++++++++++++++++------ kernel/power/power.h | 2 3 files changed, 350 insertions(+), 43 deletions(-) Index: linux-2.6.17/kernel/power/main.c =================================================================== --- linux-2.6.17.orig/kernel/power/main.c +++ linux-2.6.17/kernel/power/main.c @@ -16,6 +16,7 @@ #include <linux/init.h> #include <linux/pm.h> #include <linux/console.h> +#include <linux/module.h> #include "power.h" @@ -49,7 +50,7 @@ void pm_set_ops(struct pm_ops * ops) * the platform can enter the requested state. */ -static int suspend_prepare(suspend_state_t state) +static int suspend_prepare(struct oppoint * state) { int error = 0; unsigned int free_pages; @@ -82,7 +83,7 @@ static int suspend_prepare(suspend_state } if (pm_ops->prepare) { - if ((error = pm_ops->prepare(state))) + if ((error = pm_ops->prepare(state->type))) goto Thaw; } @@ -94,7 +95,7 @@ static int suspend_prepare(suspend_state return 0; Finish: if (pm_ops->finish) - pm_ops->finish(state); + pm_ops->finish(state->type); Thaw: thaw_processes(); Enable_cpu: @@ -104,7 +105,7 @@ static int suspend_prepare(suspend_state } -int suspend_enter(suspend_state_t state) +int suspend_enter(struct oppoint * state) { int error = 0; unsigned long flags; @@ -115,7 +116,7 @@ int suspend_enter(suspend_state_t state) printk(KERN_ERR "Some devices failed to power down\n"); goto Done; } - error = pm_ops->enter(state); + error = pm_ops->enter(state->type); device_power_up(); Done: local_irq_restore(flags); @@ -131,36 +132,82 @@ int suspend_enter(suspend_state_t state) * console that we've allocated. This is not called for suspend-to-disk. */ -static void suspend_finish(suspend_state_t state) +static void suspend_finish(struct oppoint * state) { device_resume(); resume_console(); thaw_processes(); enable_nonboot_cpus(); if (pm_ops && pm_ops->finish) - pm_ops->finish(state); + pm_ops->finish(state->type); pm_restore_console(); } +struct list_head pm_list; +static struct oppoint standby = { + .name = "standby", + .type = PM_SUSPEND_STANDBY, +}; +static struct oppoint mem = { + .name = "mem", + .type = PM_SUSPEND_MEM, + .frequency = 0, + .voltage = 0, + .latency = 150, +}; - -static const char * const pm_states[PM_SUSPEND_MAX] = { - [PM_SUSPEND_STANDBY] = "standby", - [PM_SUSPEND_MEM] = "mem", #ifdef CONFIG_SOFTWARE_SUSPEND - [PM_SUSPEND_DISK] = "disk", +struct oppoint disk = { + .name = "disk", + .type = PM_SUSPEND_DISK, +}; #endif + +struct oppoint pm_states = { + .name = "default", + .type = PM_FREQ_CHANGE, }; +struct oppoint *current_state; + +/* + * + */ +static int pm_change_state(struct oppoint *state) +{ + int error = 0; + + printk("OpPoint: changing from %s to %s\n", current_state->name, + state->name); + /* + * compare to current operating point. + * if different change to new operating point. + */ + if (current_state == state) + goto out; + + if ((error = state->prepare_transition(current_state, state))) + goto out; + + if ((error = state->transition(current_state, state))) + state = current_state; + + if ((error = state->finish_transition(current_state, state)) == 0) + current_state = state; + +out: + printk("OpPoint: State change returned %d\n", error); + return error; +} -static inline int valid_state(suspend_state_t state) +static inline int valid_state(struct oppoint * state) { /* Suspend-to-disk does not really need low-level support. * It can work with reboot if needed. */ - if (state == PM_SUSPEND_DISK) + if (state->type == PM_SUSPEND_DISK) return 1; - if (pm_ops && pm_ops->valid && !pm_ops->valid(state)) + if (pm_ops && pm_ops->valid && !pm_ops->valid(state->type)) return 0; return 1; } @@ -168,7 +215,7 @@ static inline int valid_state(suspend_st /** * enter_state - Do common work of entering low-power state. - * @state: pm_state structure for state we're entering. + * @state: oppoint structure for state we're entering. * * Make sure we're the only ones trying to enter a sleep state. Fail * if someone has beat us to it, since we don't want anything weird to @@ -177,7 +224,7 @@ static inline int valid_state(suspend_st * we've woken up). */ -static int enter_state(suspend_state_t state) +static int enter_state(struct oppoint *state) { int error; @@ -186,16 +233,21 @@ static int enter_state(suspend_state_t s if (down_trylock(&pm_sem)) return -EBUSY; - if (state == PM_SUSPEND_DISK) { + if (state->type == PM_SUSPEND_DISK) { error = pm_suspend_disk(); goto Unlock; } - pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]); + if (state->type == PM_FREQ_CHANGE || state->type == PM_VOLT_CHANGE) { + error = pm_change_state(state); + goto Unlock; + } + + pr_debug("PM: Preparing system for %s sleep\n", state->name); if ((error = suspend_prepare(state))) goto Unlock; - pr_debug("PM: Entering %s sleep\n", pm_states[state]); + pr_debug("PM: Entering %s sleep\n", state->name); error = suspend_enter(state); pr_debug("PM: Finishing wakeup.\n"); @@ -211,7 +263,15 @@ static int enter_state(suspend_state_t s */ int software_suspend(void) { - return enter_state(PM_SUSPEND_DISK); + struct oppoint *this, *next; + struct list_head *head = &mem.list; + int error = 0; + + list_for_each_entry_safe(this, next, head, list) { + if (this->type == PM_SUSPEND_DISK) + error= enter_state(this); + } + return error; } @@ -223,9 +283,9 @@ int software_suspend(void) * structure, and enter (above). */ -int pm_suspend(suspend_state_t state) +int pm_suspend(struct oppoint * state) { - if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX) + if (state->type > PM_SUSPEND_ON && state->type <= PM_SUSPEND_MAX) return enter_state(state); return -EINVAL; } @@ -248,36 +308,35 @@ decl_subsys(power,NULL,NULL); static ssize_t state_show(struct subsystem * subsys, char * buf) { - int i; - char * s = buf; + struct oppoint *this, *next; + struct list_head *head = &pm_list; + char *s = buf; + + list_for_each_entry_safe(this, next, head, list) + s += sprintf(s,"%s ", this->name); - for (i = 0; i < PM_SUSPEND_MAX; i++) { - if (pm_states[i] && valid_state(i)) - s += sprintf(s,"%s ", pm_states[i]); - } s += sprintf(s,"\n"); + return (s - buf); } static ssize_t state_store(struct subsystem * subsys, const char * buf, size_t n) { - suspend_state_t state = PM_SUSPEND_STANDBY; - const char * const *s; + struct oppoint *this, *next; + struct list_head *head = &mem.list; char *p; - int error; + int error = -EINVAL; int len; p = memchr(buf, '\n', n); len = p ? p - buf : n; - - for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) { - if (*s && !strncmp(buf, *s, len)) + list_for_each_entry_safe(this, next, head, list) { + if ((strlen(this->name) == len) && + (!strncmp(this->name, buf, len))) { + error = enter_state(this); break; + } } - if (state < PM_SUSPEND_MAX && *s) - error = enter_state(state); - else - error = -EINVAL; return error ? error : n; } @@ -292,12 +351,234 @@ static struct attribute_group attr_group .attrs = g, }; +static struct kobject oppoint_kobj = { + .kset = &power_subsys.kset, +}; + +struct oppoint_attribute { + struct attribute attr; + ssize_t (*show)(struct kobject * kobj, char * buf); + ssize_t (*store)(struct kobject * kobj, const char * buf, size_t count); +}; + +#define to_oppoint(obj) container_of(obj,struct oppoint,kobj) +#define to_oppoint_attr(_attr) container_of(_attr,struct oppoint_attribute,attr) +/* + * the frequency, voltage and latency files are readonly + */ + +static ssize_t oppoint_voltage_show(struct kobject * kobj, char * buf) +{ + ssize_t len; + struct oppoint *opt = to_oppoint(kobj); + + len = sprintf(buf, "%8d\n", opt->voltage); + + return len; +} + +static ssize_t oppoint_voltage_store(struct kobject * kobj, const char * buf, + size_t n) +{ + return -EINVAL; + +} + +static ssize_t oppoint_frequency_show(struct kobject * kobj, char * buf) +{ + ssize_t len; + struct oppoint *opt = to_oppoint(kobj); + + len = sprintf(buf, "%8d\n", opt->frequency); + + return len; +} + +static ssize_t oppoint_frequency_store(struct kobject * kobj, + const char * buf, size_t n) +{ + return -EINVAL; + +} + +static ssize_t oppoint_point_show(struct kobject * kobj, char * buf) +{ + ssize_t len; + + len = sprintf(buf, "%s\n", current_state->name); + + return len; +} + +static ssize_t oppoint_point_store(struct kobject * kobj, const char * buf, + size_t n) +{ + struct oppoint *this, *next; + struct list_head *head = &pm_states.list; + char *p; + int error = -EINVAL; + int len; + + p = memchr(buf, '\n', n); + len = p ? p - buf : n; + list_for_each_entry_safe(this, next, head, list) { + if ((strlen(this->name) == len) && + (!strncmp(this->name, buf, len))) { + error = enter_state(this); + break; + } + } + return error ? error : n; +} + +static ssize_t oppoint_latency_show(struct kobject * kobj, char * buf) +{ + ssize_t len; + struct oppoint *opt = to_oppoint(kobj); + + len = sprintf(buf, "%8d\n", opt->latency); + + return len; +} + +static ssize_t oppoint_latency_store(struct kobject * kobj, + const char * buf, size_t n) +{ + return -EINVAL; + +} + +static struct oppoint_attribute point_attr = { + .attr = { + .name = "current_point", + .mode = 0600, + }, + .show = oppoint_point_show, + .store = oppoint_point_store, +}; + +static struct oppoint_attribute frequency_attr = { + .attr = { + .name = "frequency", + .mode = 0400, + }, + .show = oppoint_frequency_show, + .store = oppoint_frequency_store, +}; + +static struct oppoint_attribute voltage_attr = { + .attr = { + .name = "voltage", + .mode = 0400, + }, + .show = oppoint_voltage_show, + .store = oppoint_voltage_store, +}; + +static struct oppoint_attribute latency_attr = { + .attr = { + .name = "latency", + .mode = 0400, + }, + .show = oppoint_latency_show, + .store = oppoint_latency_store, +}; + +static ssize_t +oppoint_attr_show(struct kobject * kobj, struct attribute * attr, char * buf) +{ + struct oppoint_attribute * opt_attr = to_oppoint_attr(attr); + ssize_t ret = 0; + + if (opt_attr->show) + ret = opt_attr->show(kobj,buf); + return ret; +} + +static ssize_t +oppoint_attr_store(struct kobject * kobj, struct attribute * attr, + const char * buf, size_t count) +{ + return -EINVAL; +} + +static void oppoint_kobj_release(struct kobject *kobj) +{ + return; +} + +static struct sysfs_ops oppoint_sysfs_ops = { + .show = oppoint_attr_show, + .store = oppoint_attr_store, +}; + +static struct attribute * oppoint_default_attrs[] = { + &frequency_attr.attr, + &voltage_attr.attr, + &latency_attr.attr, + NULL, +}; + +static struct kobj_type ktype_operating_point = { + .release = oppoint_kobj_release, + .sysfs_ops = &oppoint_sysfs_ops, + .default_attrs = oppoint_default_attrs, +}; + +int unregister_operating_point(struct oppoint *opt) +{ + down(&pm_sem); + list_del_init(&opt->list); + sysfs_remove_file(&opt->kobj, &frequency_attr.attr); + sysfs_remove_file(&opt->kobj, &voltage_attr.attr); + sysfs_remove_file(&opt->kobj, &latency_attr.attr); + up(&pm_sem); + + return 0; +} +EXPORT_SYMBOL(unregister_operating_point); + +int register_operating_point(struct oppoint *opt) +{ + down(&pm_sem); + kobject_set_name(&opt->kobj, opt->name); + opt->kobj.kset = &power_subsys.kset; + opt->kobj.parent = &oppoint_kobj; + opt->kobj.ktype = &ktype_operating_point; + kobject_register(&opt->kobj); + + sysfs_create_file(&opt->kobj, &frequency_attr.attr); + sysfs_create_file(&opt->kobj, &voltage_attr.attr); + sysfs_create_file(&opt->kobj, &latency_attr.attr); + + list_add_tail(&opt->list, &pm_states.list); + up(&pm_sem); + return 0; +} +EXPORT_SYMBOL(register_operating_point); static int __init pm_init(void) { + int error = subsystem_register(&power_subsys); - if (!error) + if (!error) { error = sysfs_create_group(&power_subsys.kset.kobj,&attr_group); + kobject_set_name(&oppoint_kobj, "operating_points"); + kobject_register(&oppoint_kobj); + sysfs_create_file(&oppoint_kobj, &point_attr.attr); + } + + + INIT_LIST_HEAD(&pm_states.list); + INIT_LIST_HEAD(&pm_list); + +#ifdef CONFIG_SOFTWARE_SUSPEND + list_add(&disk.list, &pm_list); +#endif + list_add(&standby.list, &pm_list); + list_add(&mem.list, &pm_list); + current_state = &pm_states; + return error; } Index: linux-2.6.17/include/linux/pm.h =================================================================== --- linux-2.6.17.orig/include/linux/pm.h +++ linux-2.6.17/include/linux/pm.h @@ -24,6 +24,7 @@ #ifdef __KERNEL__ #include <linux/list.h> +#include <linux/kobject.h> #include <asm/atomic.h> /* @@ -108,7 +109,32 @@ typedef int __bitwise suspend_state_t; #define PM_SUSPEND_STANDBY ((__force suspend_state_t) 1) #define PM_SUSPEND_MEM ((__force suspend_state_t) 3) #define PM_SUSPEND_DISK ((__force suspend_state_t) 4) -#define PM_SUSPEND_MAX ((__force suspend_state_t) 5) +#define PM_FREQ_CHANGE ((__force suspend_state_t) 5) +#define PM_VOLT_CHANGE ((__force suspend_state_t) 6) +#define PM_SUSPEND_MAX ((__force suspend_state_t) 7) + +struct oppoint { + struct list_head list; + suspend_state_t type; + char *name; + unsigned int flags; + unsigned int frequency; /* in KHz */ + unsigned int voltage; /* mV */ + unsigned int latency; /* transition latency in us */ + int (*prepare_transition)(struct oppoint *cur, struct oppoint *new); + int (*transition)(struct oppoint *cur, struct oppoint *new); + int (*finish_transition)(struct oppoint *cur, struct oppoint *new); + + void *md_data; /* arch dependent data */ + struct kobject kobj; +}; + + +extern struct oppoint pm_states; +extern struct oppoint *current_state; +extern int register_operating_point(struct oppoint *opt); +extern int unregister_operating_point(struct oppoint *opt); +struct notifier_block; typedef int __bitwise suspend_disk_method_t; @@ -128,7 +154,7 @@ struct pm_ops { extern void pm_set_ops(struct pm_ops *); extern struct pm_ops *pm_ops; -extern int pm_suspend(suspend_state_t state); +extern int pm_suspend(struct oppoint *state); /* Index: linux-2.6.17/kernel/power/power.h =================================================================== --- linux-2.6.17.orig/kernel/power/power.h +++ linux-2.6.17/kernel/power/power.h @@ -113,4 +113,4 @@ extern int swsusp_resume(void); extern int swsusp_read(void); extern int swsusp_write(void); extern void swsusp_close(void); -extern int suspend_enter(suspend_state_t state); +extern int suspend_enter(struct oppoint * state); ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [linux-pm] OpPoint summary 2006-09-17 5:07 ` David Singleton @ 2006-09-17 22:43 ` Matthew Locke 0 siblings, 0 replies; 5+ messages in thread From: Matthew Locke @ 2006-09-17 22:43 UTC (permalink / raw) To: David Singleton; +Cc: kernel list, linux-pm, Greg KH, pavel Dave, I am confused as to why you continue to duplicate work already done in PowerOP. We are still waiting to hear why you need to have a different operating point interface. As Eugeny and I have pointed out before (refer to the thread in my PowerOP vs OPpoint email), your interface doesn't make sense on a mulit cpu, SMP or otherwise, system. I recommend you redo your patches to use PowerOP which already supports this properly. On Sep 16, 2006, at 10:07 PM, David Singleton wrote: > Pavel and Greg, > > I've incorporated Pavels suggestions and only put suspend states > in the /sys/power/state file. The control file for frequency and > voltage operating > point transitions is now in /sys/power/operating_points/current_point. > > The /sys/power/operating_points dirctory still contains the operating > points themselves, with a frequency, voltage and latency file > for each operating point. > > The oppointd power manager has been changed to use the > new control file for operating points. It has been tested on > a centrino laptop, the 4 way Xeon server and the arm-pxa27x. > > I finally got SMP tested on a 4 way Xeon server. The patch > that supports SMP Xeon's is the oppoint-x86-p4.patch in the series. > > The only files in the core framework patch now are: > > kernel/power/main.c > include/linux/pm.h > kernel/power/power.h > > The full patch set is at > > http://source.mvista.com/~dsingleton/2.6.18-rc7 > > The power manager source and patch is at: > > http://source.mvista.com/~dsingleton/oppointd-1.2.3 > > Attached is the oppoint-core.patch. > > David > > Signed-Off-by: David Singleton <dsingleton@mvista.com> > > include/linux/pm.h | 30 +++- > kernel/power/main.c | 361 > +++++++++++++++++++++++++++++++++++++++++++++------ > kernel/power/power.h | 2 > 3 files changed, 350 insertions(+), 43 deletions(-) > > Index: linux-2.6.17/kernel/power/main.c > =================================================================== > --- linux-2.6.17.orig/kernel/power/main.c > +++ linux-2.6.17/kernel/power/main.c > @@ -16,6 +16,7 @@ > #include <linux/init.h> > #include <linux/pm.h> > #include <linux/console.h> > +#include <linux/module.h> > > #include "power.h" > > @@ -49,7 +50,7 @@ void pm_set_ops(struct pm_ops * ops) > * the platform can enter the requested state. > */ > > -static int suspend_prepare(suspend_state_t state) > +static int suspend_prepare(struct oppoint * state) > { > int error = 0; > unsigned int free_pages; > @@ -82,7 +83,7 @@ static int suspend_prepare(suspend_state > } > > if (pm_ops->prepare) { > - if ((error = pm_ops->prepare(state))) > + if ((error = pm_ops->prepare(state->type))) > goto Thaw; > } > > @@ -94,7 +95,7 @@ static int suspend_prepare(suspend_state > return 0; > Finish: > if (pm_ops->finish) > - pm_ops->finish(state); > + pm_ops->finish(state->type); > Thaw: > thaw_processes(); > Enable_cpu: > @@ -104,7 +105,7 @@ static int suspend_prepare(suspend_state > } > > > -int suspend_enter(suspend_state_t state) > +int suspend_enter(struct oppoint * state) > { > int error = 0; > unsigned long flags; > @@ -115,7 +116,7 @@ int suspend_enter(suspend_state_t state) > printk(KERN_ERR "Some devices failed to power down\n"); > goto Done; > } > - error = pm_ops->enter(state); > + error = pm_ops->enter(state->type); > device_power_up(); > Done: > local_irq_restore(flags); > @@ -131,36 +132,82 @@ int suspend_enter(suspend_state_t state) > * console that we've allocated. This is not called for > suspend-to-disk. > */ > > -static void suspend_finish(suspend_state_t state) > +static void suspend_finish(struct oppoint * state) > { > device_resume(); > resume_console(); > thaw_processes(); > enable_nonboot_cpus(); > if (pm_ops && pm_ops->finish) > - pm_ops->finish(state); > + pm_ops->finish(state->type); > pm_restore_console(); > } > > +struct list_head pm_list; > +static struct oppoint standby = { > + .name = "standby", > + .type = PM_SUSPEND_STANDBY, > +}; > > +static struct oppoint mem = { > + .name = "mem", > + .type = PM_SUSPEND_MEM, > + .frequency = 0, > + .voltage = 0, > + .latency = 150, > +}; > > - > -static const char * const pm_states[PM_SUSPEND_MAX] = { > - [PM_SUSPEND_STANDBY] = "standby", > - [PM_SUSPEND_MEM] = "mem", > #ifdef CONFIG_SOFTWARE_SUSPEND > - [PM_SUSPEND_DISK] = "disk", > +struct oppoint disk = { > + .name = "disk", > + .type = PM_SUSPEND_DISK, > +}; > #endif > + > +struct oppoint pm_states = { > + .name = "default", > + .type = PM_FREQ_CHANGE, > }; > +struct oppoint *current_state; > + > +/* > + * > + */ > +static int pm_change_state(struct oppoint *state) > +{ > + int error = 0; > + > + printk("OpPoint: changing from %s to %s\n", > current_state->name, > + state->name); > + /* > + * compare to current operating point. > + * if different change to new operating point. > + */ > + if (current_state == state) > + goto out; > + > + if ((error = state->prepare_transition(current_state, state))) > + goto out; > + > + if ((error = state->transition(current_state, state))) > + state = current_state; > + > + if ((error = state->finish_transition(current_state, state)) > == 0) > + current_state = state; > + > +out: > + printk("OpPoint: State change returned %d\n", error); > + return error; > +} > > -static inline int valid_state(suspend_state_t state) > +static inline int valid_state(struct oppoint * state) > { > /* Suspend-to-disk does not really need low-level support. > * It can work with reboot if needed. */ > - if (state == PM_SUSPEND_DISK) > + if (state->type == PM_SUSPEND_DISK) > return 1; > > - if (pm_ops && pm_ops->valid && !pm_ops->valid(state)) > + if (pm_ops && pm_ops->valid && !pm_ops->valid(state->type)) > return 0; > return 1; > } > @@ -168,7 +215,7 @@ static inline int valid_state(suspend_st > > /** > * enter_state - Do common work of entering low-power state. > - * @state: pm_state structure for state we're entering. > + * @state: oppoint structure for state we're entering. > * > * Make sure we're the only ones trying to enter a sleep state. > Fail > * if someone has beat us to it, since we don't want anything > weird to > @@ -177,7 +224,7 @@ static inline int valid_state(suspend_st > * we've woken up). > */ > > -static int enter_state(suspend_state_t state) > +static int enter_state(struct oppoint *state) > { > int error; > > @@ -186,16 +233,21 @@ static int enter_state(suspend_state_t s > if (down_trylock(&pm_sem)) > return -EBUSY; > > - if (state == PM_SUSPEND_DISK) { > + if (state->type == PM_SUSPEND_DISK) { > error = pm_suspend_disk(); > goto Unlock; > } > > - pr_debug("PM: Preparing system for %s sleep\n", > pm_states[state]); > + if (state->type == PM_FREQ_CHANGE || state->type == > PM_VOLT_CHANGE) { > + error = pm_change_state(state); > + goto Unlock; > + } > + > + pr_debug("PM: Preparing system for %s sleep\n", state->name); > if ((error = suspend_prepare(state))) > goto Unlock; > > - pr_debug("PM: Entering %s sleep\n", pm_states[state]); > + pr_debug("PM: Entering %s sleep\n", state->name); > error = suspend_enter(state); > > pr_debug("PM: Finishing wakeup.\n"); > @@ -211,7 +263,15 @@ static int enter_state(suspend_state_t s > */ > int software_suspend(void) > { > - return enter_state(PM_SUSPEND_DISK); > + struct oppoint *this, *next; > + struct list_head *head = &mem.list; > + int error = 0; > + > + list_for_each_entry_safe(this, next, head, list) { > + if (this->type == PM_SUSPEND_DISK) > + error= enter_state(this); > + } > + return error; > } > > > @@ -223,9 +283,9 @@ int software_suspend(void) > * structure, and enter (above). > */ > > -int pm_suspend(suspend_state_t state) > +int pm_suspend(struct oppoint * state) > { > - if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX) > + if (state->type > PM_SUSPEND_ON && state->type <= > PM_SUSPEND_MAX) > return enter_state(state); > return -EINVAL; > } > @@ -248,36 +308,35 @@ decl_subsys(power,NULL,NULL); > > static ssize_t state_show(struct subsystem * subsys, char * buf) > { > - int i; > - char * s = buf; > + struct oppoint *this, *next; > + struct list_head *head = &pm_list; > + char *s = buf; > + > + list_for_each_entry_safe(this, next, head, list) > + s += sprintf(s,"%s ", this->name); > > - for (i = 0; i < PM_SUSPEND_MAX; i++) { > - if (pm_states[i] && valid_state(i)) > - s += sprintf(s,"%s ", pm_states[i]); > - } > s += sprintf(s,"\n"); > + > return (s - buf); > } > > static ssize_t state_store(struct subsystem * subsys, const char * > buf, size_t n) > { > - suspend_state_t state = PM_SUSPEND_STANDBY; > - const char * const *s; > + struct oppoint *this, *next; > + struct list_head *head = &mem.list; > char *p; > - int error; > + int error = -EINVAL; > int len; > > p = memchr(buf, '\n', n); > len = p ? p - buf : n; > - > - for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, > state++) { > - if (*s && !strncmp(buf, *s, len)) > + list_for_each_entry_safe(this, next, head, list) { > + if ((strlen(this->name) == len) && > + (!strncmp(this->name, buf, len))) { > + error = enter_state(this); > break; > + } > } > - if (state < PM_SUSPEND_MAX && *s) > - error = enter_state(state); > - else > - error = -EINVAL; > return error ? error : n; > } > > @@ -292,12 +351,234 @@ static struct attribute_group attr_group > .attrs = g, > }; > > +static struct kobject oppoint_kobj = { > + .kset = &power_subsys.kset, > +}; > + > +struct oppoint_attribute { > + struct attribute attr; > + ssize_t (*show)(struct kobject * kobj, char * buf); > + ssize_t (*store)(struct kobject * kobj, const char * buf, > size_t count); > +}; > + > +#define to_oppoint(obj) container_of(obj,struct oppoint,kobj) > +#define to_oppoint_attr(_attr) container_of(_attr,struct > oppoint_attribute,attr) > +/* > + * the frequency, voltage and latency files are readonly > + */ > + > +static ssize_t oppoint_voltage_show(struct kobject * kobj, char * buf) > +{ > + ssize_t len; > + struct oppoint *opt = to_oppoint(kobj); > + > + len = sprintf(buf, "%8d\n", opt->voltage); > + > + return len; > +} > + > +static ssize_t oppoint_voltage_store(struct kobject * kobj, const > char * buf, > + size_t n) > +{ > + return -EINVAL; > + > +} > + > +static ssize_t oppoint_frequency_show(struct kobject * kobj, char * > buf) > +{ > + ssize_t len; > + struct oppoint *opt = to_oppoint(kobj); > + > + len = sprintf(buf, "%8d\n", opt->frequency); > + > + return len; > +} > + > +static ssize_t oppoint_frequency_store(struct kobject * kobj, > + const char * buf, size_t n) > +{ > + return -EINVAL; > + > +} > + > +static ssize_t oppoint_point_show(struct kobject * kobj, char * buf) > +{ > + ssize_t len; > + > + len = sprintf(buf, "%s\n", current_state->name); > + > + return len; > +} > + > +static ssize_t oppoint_point_store(struct kobject * kobj, const char > * buf, > + size_t n) > +{ > + struct oppoint *this, *next; > + struct list_head *head = &pm_states.list; > + char *p; > + int error = -EINVAL; > + int len; > + > + p = memchr(buf, '\n', n); > + len = p ? p - buf : n; > + list_for_each_entry_safe(this, next, head, list) { > + if ((strlen(this->name) == len) && > + (!strncmp(this->name, buf, len))) { > + error = enter_state(this); > + break; > + } > + } > + return error ? error : n; > +} > + > +static ssize_t oppoint_latency_show(struct kobject * kobj, char * buf) > +{ > + ssize_t len; > + struct oppoint *opt = to_oppoint(kobj); > + > + len = sprintf(buf, "%8d\n", opt->latency); > + > + return len; > +} > + > +static ssize_t oppoint_latency_store(struct kobject * kobj, > + const char * buf, size_t n) > +{ > + return -EINVAL; > + > +} > + > +static struct oppoint_attribute point_attr = { > + .attr = { > + .name = "current_point", > + .mode = 0600, > + }, > + .show = oppoint_point_show, > + .store = oppoint_point_store, > +}; > + > +static struct oppoint_attribute frequency_attr = { > + .attr = { > + .name = "frequency", > + .mode = 0400, > + }, > + .show = oppoint_frequency_show, > + .store = oppoint_frequency_store, > +}; > + > +static struct oppoint_attribute voltage_attr = { > + .attr = { > + .name = "voltage", > + .mode = 0400, > + }, > + .show = oppoint_voltage_show, > + .store = oppoint_voltage_store, > +}; > + > +static struct oppoint_attribute latency_attr = { > + .attr = { > + .name = "latency", > + .mode = 0400, > + }, > + .show = oppoint_latency_show, > + .store = oppoint_latency_store, > +}; > + > +static ssize_t > +oppoint_attr_show(struct kobject * kobj, struct attribute * attr, > char * buf) > +{ > + struct oppoint_attribute * opt_attr = to_oppoint_attr(attr); > + ssize_t ret = 0; > + > + if (opt_attr->show) > + ret = opt_attr->show(kobj,buf); > + return ret; > +} > + > +static ssize_t > +oppoint_attr_store(struct kobject * kobj, struct attribute * attr, > + const char * buf, size_t count) > +{ > + return -EINVAL; > +} > + > +static void oppoint_kobj_release(struct kobject *kobj) > +{ > + return; > +} > + > +static struct sysfs_ops oppoint_sysfs_ops = { > + .show = oppoint_attr_show, > + .store = oppoint_attr_store, > +}; > + > +static struct attribute * oppoint_default_attrs[] = { > + &frequency_attr.attr, > + &voltage_attr.attr, > + &latency_attr.attr, > + NULL, > +}; > + > +static struct kobj_type ktype_operating_point = { > + .release = oppoint_kobj_release, > + .sysfs_ops = &oppoint_sysfs_ops, > + .default_attrs = oppoint_default_attrs, > +}; > + > +int unregister_operating_point(struct oppoint *opt) > +{ > + down(&pm_sem); > + list_del_init(&opt->list); > + sysfs_remove_file(&opt->kobj, &frequency_attr.attr); > + sysfs_remove_file(&opt->kobj, &voltage_attr.attr); > + sysfs_remove_file(&opt->kobj, &latency_attr.attr); > + up(&pm_sem); > + > + return 0; > +} > +EXPORT_SYMBOL(unregister_operating_point); > + > +int register_operating_point(struct oppoint *opt) > +{ > + down(&pm_sem); > + kobject_set_name(&opt->kobj, opt->name); > + opt->kobj.kset = &power_subsys.kset; > + opt->kobj.parent = &oppoint_kobj; > + opt->kobj.ktype = &ktype_operating_point; > + kobject_register(&opt->kobj); > + > + sysfs_create_file(&opt->kobj, &frequency_attr.attr); > + sysfs_create_file(&opt->kobj, &voltage_attr.attr); > + sysfs_create_file(&opt->kobj, &latency_attr.attr); > + > + list_add_tail(&opt->list, &pm_states.list); > + up(&pm_sem); > + return 0; > +} > +EXPORT_SYMBOL(register_operating_point); > > static int __init pm_init(void) > { > + > int error = subsystem_register(&power_subsys); > - if (!error) > + if (!error) { > error = > sysfs_create_group(&power_subsys.kset.kobj,&attr_group); > + kobject_set_name(&oppoint_kobj, "operating_points"); > + kobject_register(&oppoint_kobj); > + sysfs_create_file(&oppoint_kobj, &point_attr.attr); > + } > + > + > + INIT_LIST_HEAD(&pm_states.list); > + INIT_LIST_HEAD(&pm_list); > + > +#ifdef CONFIG_SOFTWARE_SUSPEND > + list_add(&disk.list, &pm_list); > +#endif > + list_add(&standby.list, &pm_list); > + list_add(&mem.list, &pm_list); > + current_state = &pm_states; > + > return error; > } > > Index: linux-2.6.17/include/linux/pm.h > =================================================================== > --- linux-2.6.17.orig/include/linux/pm.h > +++ linux-2.6.17/include/linux/pm.h > @@ -24,6 +24,7 @@ > #ifdef __KERNEL__ > > #include <linux/list.h> > +#include <linux/kobject.h> > #include <asm/atomic.h> > > /* > @@ -108,7 +109,32 @@ typedef int __bitwise suspend_state_t; > #define PM_SUSPEND_STANDBY ((__force suspend_state_t) 1) > #define PM_SUSPEND_MEM ((__force suspend_state_t) 3) > #define PM_SUSPEND_DISK ((__force suspend_state_t) 4) > -#define PM_SUSPEND_MAX ((__force suspend_state_t) 5) > +#define PM_FREQ_CHANGE ((__force suspend_state_t) 5) > +#define PM_VOLT_CHANGE ((__force suspend_state_t) 6) > +#define PM_SUSPEND_MAX ((__force suspend_state_t) 7) > + > +struct oppoint { > + struct list_head list; > + suspend_state_t type; > + char *name; > + unsigned int flags; > + unsigned int frequency; /* in KHz */ > + unsigned int voltage; /* mV */ > + unsigned int latency; /* transition latency in us */ > + int (*prepare_transition)(struct oppoint *cur, struct > oppoint *new); > + int (*transition)(struct oppoint *cur, struct oppoint > *new); > + int (*finish_transition)(struct oppoint *cur, struct > oppoint *new); > + > + void *md_data; /* arch dependent data */ > + struct kobject kobj; > +}; > + > + > +extern struct oppoint pm_states; > +extern struct oppoint *current_state; > +extern int register_operating_point(struct oppoint *opt); > +extern int unregister_operating_point(struct oppoint *opt); > +struct notifier_block; > > typedef int __bitwise suspend_disk_method_t; > > @@ -128,7 +154,7 @@ struct pm_ops { > > extern void pm_set_ops(struct pm_ops *); > extern struct pm_ops *pm_ops; > -extern int pm_suspend(suspend_state_t state); > +extern int pm_suspend(struct oppoint *state); > > > /* > Index: linux-2.6.17/kernel/power/power.h > =================================================================== > --- linux-2.6.17.orig/kernel/power/power.h > +++ linux-2.6.17/kernel/power/power.h > @@ -113,4 +113,4 @@ extern int swsusp_resume(void); > extern int swsusp_read(void); > extern int swsusp_write(void); > extern void swsusp_close(void); > -extern int suspend_enter(suspend_state_t state); > +extern int suspend_enter(struct oppoint * state); > _______________________________________________ > linux-pm mailing list > linux-pm@lists.osdl.org > https://lists.osdl.org/mailman/listinfo/linux-pm > ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2006-09-18 14:33 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-09-18 13:36 [linux-pm] OpPoint summary Scott E. Preece 2006-09-18 13:46 ` Pavel Machek -- strict thread matches above, loose matches on Subject: below -- 2006-09-11 8:20 community PM requirements/issues and PowerOP [Was: Re: So, what's the status on the recent patches here?] Pavel Machek 2006-09-11 19:55 ` Pavel Machek 2006-09-11 20:53 ` Eugeny S. Mints 2006-09-11 21:00 ` Pavel Machek 2006-09-11 22:05 ` Eugeny S. Mints 2006-09-11 22:56 ` cpufreq terminally broken [was Re: community PM requirements/issues and PowerOP] Pavel Machek 2006-09-12 0:17 ` Mark Gross 2006-09-12 3:37 ` Greg KH 2006-09-13 23:50 ` [linux-pm] " David Singleton 2006-09-14 5:55 ` OpPoint summary Greg KH 2006-09-14 17:07 ` David Singleton 2006-09-14 17:25 ` Auke Kok 2006-09-14 18:15 ` [linux-pm] " Vitaly Wool 2006-09-17 17:48 ` Pavel Machek 2006-09-18 14:33 ` [linux-pm] " Richard A. Griffiths 2006-09-17 5:07 ` David Singleton 2006-09-17 22:43 ` [linux-pm] " Matthew Locke
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox