* [PATCH] If a CPU gets onlined set the governor to the one that is run on other CPUs
From: Thomas Renninger @ 2006-11-23 15:37 UTC (permalink / raw)
To: cpufreq; +Cc: Dave Jones
Hi,
I wonder whether there is any real use to allow userspace to
run different governors on different CPUs?
This complicates things (in kernel and userspace) and
if not really needed I'd link up all
sys/../cpu*/cpufreq/scaling_governor files?
If a CPU gets onlined set the governor to the one that is run on other CPUs
If you offline a CPU and online it again (as done by swsusp on SMP),
the cpufreq governor jumps back to performance (or the compiled-in default
governor).
With this patch, when a CPU is onlined the governors of the other CPUs
are checked:
a) if all run the same governor, the CPU that gets onlined is set to
that governor
b) if different governors are running on different CPUs, the CPU that
got onlined is set to the default governor
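The selection rule above can be sketched as a small standalone C function. This is only an illustration under stated assumptions, not the kernel code: `struct governor` and `pick_shared_governor()` are hypothetical stand-ins for `struct cpufreq_governor` and the loop over `cpufreq_cpu_data[]`, and a NULL return stands for "use the default governor".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct cpufreq_governor. */
struct governor {
	const char *name;
};

/*
 * Return the governor shared by every CPU other than `cpu` that has one.
 * Return NULL if no other CPU has a governor or if two of them differ;
 * the caller then falls back to the default governor.
 */
static const struct governor *
pick_shared_governor(const struct governor *const *per_cpu, size_t ncpus,
		     size_t cpu)
{
	const struct governor *shared = NULL;
	size_t j;

	assert(per_cpu != NULL);

	for (j = 0; j < ncpus; j++) {
		if (j == cpu || per_cpu[j] == NULL)
			continue;		/* skip ourselves and CPUs without a policy */
		if (shared == NULL)
			shared = per_cpu[j];	/* first governor seen */
		else if (shared != per_cpu[j])
			return NULL;		/* mismatch: use the default */
	}
	return shared;
}
```

The loop stops early only when two governors differ; otherwise it looks at every remaining CPU, which is what case a) requires.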
That means userspace programs that access or control cpufreq should
*never ever* run different governors on different CPUs of the same machine
at the same time. IMO this feature should be disabled by the kernel anyway, but
there might be cases where it makes sense; I don't know.
This fix is necessary because any program is allowed to offline a CPU and cpufreq
must still work. Hal/powersaved or other userspace apps that control both suspend
triggering and cpufreq might be able to work around the swsusp resume
case; others that only care about cpufreq, e.g. cpufreqd, cannot.
The ondemand governor especially should IMO always just work out of the box by simply
doing: echo ondemand >/sys/../scaling_governor (unfortunately this currently has to be
done for each CPU...).
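The per-CPU step mentioned above could be sketched in C roughly as follows. This is a minimal sketch under assumptions: `set_governor_all()` is a hypothetical helper, the caller supplies both the directory that contains the `cpu*` entries (`cpu_root`) and the number of CPUs, and only the `cpu<N>/cpufreq/scaling_governor` part of the path is taken from the text.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write `governor` to <cpu_root>/cpu<N>/cpufreq/scaling_governor for
 * N = 0 .. ncpus-1. `cpu_root` is whatever directory holds the cpu*
 * entries on the system at hand (caller-supplied assumption).
 * Returns the number of CPUs that were updated successfully.
 */
static int set_governor_all(const char *cpu_root, int ncpus,
			    const char *governor)
{
	char path[512];
	int cpu, done = 0;

	assert(cpu_root != NULL && governor != NULL);

	for (cpu = 0; cpu < ncpus; cpu++) {
		FILE *f;
		int ok;
		int n = snprintf(path, sizeof(path),
				 "%s/cpu%d/cpufreq/scaling_governor",
				 cpu_root, cpu);

		if (n < 0 || (size_t)n >= sizeof(path))
			continue;	/* path did not fit in the buffer */
		f = fopen(path, "w");
		if (f == NULL)
			continue;	/* e.g. CPU offline or no cpufreq support */
		ok = (fputs(governor, f) != EOF);
		if (fclose(f) != 0)
			ok = 0;		/* a buffered write error shows up here */
		done += ok;
	}
	return done;
}
```

A caller would compare the return value with `ncpus` to see whether every CPU accepted the governor.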
Thanks,
Thomas
Signed-off-by: Thomas Renninger <trenn@suse.de>
drivers/cpufreq/cpufreq.c | 25 +++++++++++++++++++++++++
1 files changed, 25 insertions(+)
Index: linux-2.6.18_cpufreq_debug_i386/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-2.6.18_cpufreq_debug_i386.orig/drivers/cpufreq/cpufreq.c
+++ linux-2.6.18_cpufreq_debug_i386/drivers/cpufreq/cpufreq.c
@@ -623,6 +623,7 @@ static int cpufreq_add_dev (struct sys_d
 	unsigned int j;
 #ifdef CONFIG_SMP
 	struct cpufreq_policy *managed_policy;
+	struct cpufreq_governor *governor = NULL;
 #endif
 	if (cpu_is_offline(cpu))
@@ -743,6 +744,30 @@ static int cpufreq_add_dev (struct sys_d
 	 * run in cpufreq_set_policy */
 	mutex_unlock(&policy->lock);
+#ifdef CONFIG_SMP
+	/* Set governor of added CPU to the same governor running on other CPUs
+	   If different governors are run on different CPUs default gov
+	   will be taken */
+	for (j=0; j<NR_CPUS; j++){
+		if (j == cpu)
+			continue;
+		if (cpufreq_cpu_data[j]){
+			if (!governor)
+				governor = cpufreq_cpu_data[j]->governor;
+			else{
+				if (governor != cpufreq_cpu_data[j]->governor){
+					/* different governors on different CPUs
+					   -> start default governor on this one */
+					governor = NULL;
+					break;
+				}
+			}
+		}
+	}
+	if (governor)
+		new_policy.governor = governor;
+#endif
+
 	/* set default policy */
 	ret = cpufreq_set_policy(&new_policy);
 	if (ret) {
* Re: [PATCH] If a CPU gets onlined set the governor to the one that is run on other CPUs
From: Dave Jones @ 2006-11-26 22:21 UTC (permalink / raw)
To: Thomas Renninger; +Cc: cpufreq
On Thu, Nov 23, 2006 at 04:37:17PM +0100, Thomas Renninger wrote:
> Hi,
>
> I wonder whether there is any real use to allow userspace to
> run different governors on different CPUs?
> This complicates things (in kernel and userspace) and
> if not really needed I'd link up all
> sys/../cpu*/cpufreq/scaling_governor files?
I've thought about this a few times, and I agree that it's fairly pointless,
especially with ht/multicore CPUs. Given it doesn't really bring anything
but added complexity for userspace, I'm inclined to agree that this
is the direction we should take unless someone has a compelling argument?
Dave
--
http://www.codemonkey.org.uk
* Re: [PATCH] If a CPU gets onlined set the governor to the one that is run on other CPUs
From: Dominik Brodowski @ 2006-11-27 3:04 UTC (permalink / raw)
To: Dave Jones; +Cc: cpufreq
On Sun, Nov 26, 2006 at 05:21:20PM -0500, Dave Jones wrote:
> On Thu, Nov 23, 2006 at 04:37:17PM +0100, Thomas Renninger wrote:
> > Hi,
> >
> > I wonder whether there is any real use to allow userspace to
> > run different governors on different CPUs?
> > This complicates things (in kernel and userspace) and
> > if not really needed I'd link up all
> > sys/../cpu*/cpufreq/scaling_governor files?
>
> I've thought about this a few times, and I agree that it's fairly pointless,
> especially with ht/multicore CPUs. Given it doesn't really bring anything
> but added complexity for userspace, I'm inclined to agree that this
> is the direction we should take unless someone has a compelling argument?
Seconded. It was a bad idea from the beginning, and I know who is to blame
for that:
Dominik
;)
* Re: [PATCH] If a CPU gets onlined set the governor to the one that is run on other CPUs
From: Thomas Renninger @ 2006-11-27 9:47 UTC (permalink / raw)
To: Dave Jones; +Cc: cpufreq, Stefan Seyfried
On Sun, 2006-11-26 at 17:21 -0500, Dave Jones wrote:
> On Thu, Nov 23, 2006 at 04:37:17PM +0100, Thomas Renninger wrote:
> > Hi,
> >
> > I wonder whether there is any real use to allow userspace to
> > run different governors on different CPUs?
> > This complicates things (in kernel and userspace) and
> > if not really needed I'd link up all
> > sys/../cpu*/cpufreq/scaling_governor files?
>
> I've thought about this a few times, and I agree that it's fairly pointless,
> especially with ht/multicore CPUs. Given it doesn't really bring anything
> but added complexity for userspace, I'm inclined to agree that this
> is the direction we should take unless someone has a compelling argument?
Rethinking all this, including some arguments from Seife, I am
not so sure that it was that bad to provide that capability.
Thinking about x86 may get 16/32/64 CPU sockets the next years, it may
be convenient for very specific guys to run some nodes at highest
performance to even avoid some 2% performance regression on database
nodes while others are run with ondemand.
Having some S390-like configuration in the future, where you can configure
each CPU/node as you like, may turn out to be useful (even though I don't see
any use for that atm, and not much in the future either, it may at least be
relevant for marketing purposes? I am thinking about a complex per-CPU
configuration utility...). An independent cpufreq policy will never make
sense on a laptop, but possibly on huge high-end machines?
I just wonder how to make things easier.
In the long term IMO some kind of non-CPU-specific cpufreq directory would be
needed if this feature is still to be provided:
/sys/devices/cpu/cpufreq
There could be a default-governor file there. All newly added CPUs would use
this one.
Switching between a battery and an AC governor would then not need
listening for some "suspend or CPU offline event".
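The behaviour proposed above can be modelled in a few lines of plain C. This is a sketch of the idea only, not an existing kernel interface: `struct cpufreq_global`, `cpu_online()`, `cpu_offline()` and `cpu_set_governor()` are hypothetical names, and governors are represented by their name strings.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 8	/* arbitrary size for this sketch */

/* Hypothetical model of the proposed non-CPU-specific cpufreq directory. */
struct cpufreq_global {
	const char *default_governor;		/* the proposed default-governor file */
	const char *cpu_governor[MAX_CPUS];	/* NULL while a CPU is offline */
};

/* A CPU that comes online always starts with the current default. */
static void cpu_online(struct cpufreq_global *g, size_t cpu)
{
	assert(cpu < MAX_CPUS);
	g->cpu_governor[cpu] = g->default_governor;
}

/* An offlined CPU forgets its governor. */
static void cpu_offline(struct cpufreq_global *g, size_t cpu)
{
	assert(cpu < MAX_CPUS);
	g->cpu_governor[cpu] = NULL;
}

/* Per-CPU settings remain possible, so the flexibility is kept. */
static void cpu_set_governor(struct cpufreq_global *g, size_t cpu,
			     const char *gov)
{
	assert(cpu < MAX_CPUS);
	if (g->cpu_governor[cpu] != NULL)	/* ignore offline CPUs */
		g->cpu_governor[cpu] = gov;
}
```

In this model userspace only has to update `default_governor` when switching between battery and AC; a CPU that is offlined and onlined again picks the new value up without any event handling.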
Hal already works around that atm (at least the suspend case; I am not sure
whether this is upstream already. It does not handle the "someone just offlined
a CPU" case AFAIK, but probably will soon).
The above would be sufficient to still provide full flexibility
(unfortunately still with the whole complexity in the kernel) and avoid ugly
workarounds in userspace like: if a CPU online event is received via udev
(which does not exist yet?), get the governor that should currently run
there from the Hal cpufreq module and reset it.
On the other hand, if you are really sure this is never ever needed on
any architecture, it's probably better to rip out that feature totally.
I am not that sure any more...
Thanks,
Thomas
* Re: [PATCH] If a CPU gets onlined set the governor to the one that is run on other CPUs
From: Dave Jones @ 2006-11-27 17:33 UTC (permalink / raw)
To: Thomas Renninger; +Cc: cpufreq, Stefan Seyfried
On Mon, Nov 27, 2006 at 10:47:51AM +0100, Thomas Renninger wrote:
> Thinking about x86 may get 16/32/64 CPU sockets the next years, it may
> be convenient for very specific guys to run some nodes at highest
> performance to even avoid some 2% performance regression on database
> nodes while others are run with ondemand.
For the HPC folks I've spoken with, this doesn't seem likely.
If they care at all about that 2% (which they do), they'll want that
out of every CPU, and not leave anything idle.
(Typically I hear from the HPC folks "how do I turn this cpufreq thing off?")
I'd rather we looked into merging conservative & ondemand, and have
ondemand be tunable to not scale so frequently for such users.
Dave
--
http://www.codemonkey.org.uk
* Re: [PATCH] If a CPU gets onlined set the governor to the one that is run on other CPUs
From: Ashley Pittman @ 2006-11-28 10:40 UTC (permalink / raw)
To: Dave Jones; +Cc: cpufreq, Stefan Seyfried
On Mon, 2006-11-27 at 12:33 -0500, Dave Jones wrote:
> On Mon, Nov 27, 2006 at 10:47:51AM +0100, Thomas Renninger wrote:
>
> > Thinking about x86 may get 16/32/64 CPU sockets the next years, it may
> > be convenient for very specific guys to run some nodes at highest
> > performance to even avoid some 2% performance regression on database
> > nodes while others are run with ondemand.
>
> For the HPC folks I've spoken with, this doesn't seem likely.
> If they care at all about that 2% (which they do), they'll want that
> out of every CPU, and not leave anything idle.
With the onset of cpusets this assertion possibly won't hold true for
much longer, however; we would be willing to only get 100% out of the
CPUs which have HPC jobs assigned to them.
Ashley,