* Re: cpufreq and p4 prescott
@ 2004-05-13 17:39 Dominik Brodowski
2004-05-14 21:47 ` rutger
0 siblings, 1 reply; 6+ messages in thread
From: Dominik Brodowski @ 2004-05-13 17:39 UTC (permalink / raw)
To: cpufreq, linux-kernel, rutger, moqua
[-- Attachment #1: Type: text/plain, Size: 2409 bytes --]
> > So i'm not sure if throttling does work until now?
>
> No, I think something is broken. There is something wrong, but I do
> not know what exactly. Maybe other people can help.
>
> Problem #1 is the speed measurement, as you described. As far as I
> understand, p4-clockmod delivers the same external clock to the P4,
> but work is not done during every clock tick. E.g. when running at
> 12.5% of the maximum frequency, only one tick in eight something is
> done.
Almost. The Time Stamp Counter (inside the CPU) runs at a constant
frequency, but the other parts of the CPU only do some work on, e.g., every
eighth tick.
> Ok, so if it is true that only the work is done part of the ticks,
> then all instructions should take more ticks! Therefore, I try to
> measure the number of ticks which the 'rdtsc' instruction itself
> takes. I take the minimum of 10 runs, to run
> instruction-cache-hot. See cpuclockmod.c .
>
> This gives '140' cycles in the pre-modulated phase (including some
> overhead) when running on an idle system, and 154 or 161 running on a
> loaded system (1 thread busy looping). If clock modulation meant
> 'skipping ticks', I would expect this number to multiply.
Not necessarily. It's not really every eighth tick where work is done, but
more like 800 ticks where work is done, then a 5600-tick pause, and so on --
the frequency is somewhere in the docs, I forgot the exact value... So I'm
not 100% convinced that the measurements you've done show something is broken.
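To make the duty-cycle arithmetic concrete, here is a minimal shell sketch. The 800-tick active window and 5600-tick pause are only the guessed figures from the paragraph above, not documented hardware values, and the function name `duty_permille` is made up for illustration.

```shell
#!/bin/sh
# Duty cycle in permille (tenths of a percent) for a number of active
# ticks followed by a number of paused ticks.
# Hypothetical helper; 800/5600 are guesses from the thread.
duty_permille() {
    active=$1
    paused=$2
    echo $(( active * 1000 / (active + paused) ))
}

duty_permille 800 5600   # prints 125, i.e. 12.5%
```

With these guessed figures, the active share comes out at one eighth, which matches the 12.5% setting discussed above.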
> This doesn't change a thing, which is to be expected since cpufreq
> talks to real CPUs.
It should, something _is_ broken in this regard [and I'm working on it, I just
sent an RFC to the cpufreq mailing list...]. Maybe this causes some
strangeness, especially if you run a userspace cpufreq tool -- but maybe the
p4-clockmod hardware is even stranger than I thought, and is
per _virtual_ CPU.
Can you please apply the latest cpufreq snapshot from
http://www.codemonkey.org.uk/projects/bitkeeper/cpufreq/
, then the attached patch, and switch the CPU frequencies of both (virtual)
CPUs around a bit, and after each switch,
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq
and check whether the values are the same as the ones you wrote into the
specific CPU's scaling_setspeed file [if using the userspace governor]?
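The requested check could be scripted roughly as follows. This is only a sketch: it assumes the userspace governor is active, that exactly two logical CPUs (cpu0 and cpu1) exist, and that it runs as root. `CPU_ROOT` and `check_setspeed` are illustrative names, not part of cpufreq.

```shell
#!/bin/sh
# Base directory of the per-CPU sysfs entries; overridable for dry runs.
CPU_ROOT=${CPU_ROOT:-/sys/devices/system/cpu}

# Write a frequency (kHz) to one CPU's scaling_setspeed, then print
# cpuinfo_cur_freq for cpu0 and cpu1 so both siblings can be compared
# with the value that was written.
check_setspeed() {
    cpu=$1
    khz=$2
    echo "$khz" > "$CPU_ROOT/cpu$cpu/cpufreq/scaling_setspeed" || return 1
    for sibling in 0 1; do
        cur=$(cat "$CPU_ROOT/cpu$sibling/cpufreq/cpuinfo_cur_freq") || return 1
        echo "set cpu$cpu=$khz: cpu$sibling reports $cur"
    done
}
```

For example, `check_setspeed 0 350000` followed by `check_setspeed 1 350000` would show after each switch what both siblings report.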
Many thanks,
Dominik
[-- Attachment #2: test_p4 --]
[-- Type: text/plain, Size: 1018 bytes --]
diff -ruN linux-original/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c linux/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
--- linux-original/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c 2004-05-13 16:52:02.000000000 +0200
+++ linux/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c 2004-05-13 19:36:47.629852152 +0200
@@ -68,11 +68,7 @@
 	cpus_allowed = current->cpus_allowed;
 
 	/* only run on CPU to be set, or on its sibling */
-#ifdef CONFIG_SMP
-	affected_cpu_map = cpu_sibling_map[cpu];
-#else
 	affected_cpu_map = cpumask_of_cpu(cpu);
-#endif
 	set_cpus_allowed(current, affected_cpu_map);
 	BUG_ON(!cpu_isset(smp_processor_id(), affected_cpu_map));
 
@@ -273,11 +269,7 @@
 
 	/* only run on CPU to be set, or on its sibling */
 	cpus_allowed = current->cpus_allowed;
-#ifdef CONFIG_SMP
-	affected_cpu_map = cpu_sibling_map[cpu];
-#else
 	affected_cpu_map = cpumask_of_cpu(cpu);
-#endif
 	set_cpus_allowed(current, affected_cpu_map);
 	BUG_ON(!cpu_isset(smp_processor_id(), affected_cpu_map));
 
* Re: cpufreq and p4 prescott
2004-05-13 17:39 cpufreq and p4 prescott Dominik Brodowski
@ 2004-05-14 21:47 ` rutger
2004-05-15 6:44 ` Dominik Brodowski
0 siblings, 1 reply; 6+ messages in thread
From: rutger @ 2004-05-14 21:47 UTC (permalink / raw)
To: cpufreq, linux-kernel, moqua
> > Problem #1 is the speed measurement, as you described. As far as I
> > understand, p4-clockmod delivers the same external clock to the P4,
> > but work is not done during every clock tick. E.g. when running at
> > 12.5% of the maximum frequency, only one tick in eight something is
> > done.
>
> Almost. The Time Stamp Counter (inside the CPU) works with a constant
> frequency, but only at e.g. each eigth tick the other parts of the CPU do
> some work.
That's what I meant.
>
> > Ok, so if it is true that only the work is done part of the ticks,
> > then all instructions should take more ticks! Therefore, I try to
> > measure the number of ticks which the 'rdtsc' instruction itself
> > takes. I take the minimum of 10 runs, to run
> > instruction-cache-hot. See cpuclockmod.c .
> >
> > This gives '140' cycles in the pre-modulated phase (including some
> > overhead) when running on an idle system, and 154 or 161 running on a
> > loaded system (1 thread busy looping). If clock modulation meant
> > 'skipping ticks', I would expect this number to multiply.
>
> Not necessarily. It's not really every eigth tick where work is done, but
> more like 800 ticks where work is done, then 5600 ticks pause, and so on --
> the frequency is somewhere in the docs, I forgot the exact value... So I'm
> not 100% convinced the measurements you've done do show something broken.
Ah, ok! This makes the measurement next to impossible, unless we
generate instruction sequences of ~900 ticks, which should take 900 + 5600
ticks with a modulated clock and 900 ticks with a
non-modulated clock. Something to try...
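A rough model of that idea, as a sketch only: it assumes the work starts at the beginning of an active window and that a full pause is paid each time an active window is used up. The window sizes are the guessed 800/5600 figures, and `elapsed_ticks` is a hypothetical helper.

```shell
#!/bin/sh
# Elapsed ticks for `work` ticks of execution under clock modulation.
# Simplified model (assumes work >= 1 and work starts at the beginning
# of an active window): pauses crossed = ceil(work / active) - 1.
elapsed_ticks() {
    work=$1
    active=$2
    paused=$3
    pauses=$(( (work + active - 1) / active - 1 ))
    echo $(( work + pauses * paused ))
}

elapsed_ticks 140 800 5600   # prints 140: fits inside one active window
elapsed_ticks 900 800 5600   # prints 6500: 900 plus one 5600-tick pause
```

Under this model a ~140-tick sequence would usually not cross a pause at all, which may be why the short rdtsc measurement did not multiply, while a ~900-tick sequence always would.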
> > This doesn't change a thing, which is to be expected since cpufreq
> > talks to real CPUs.
>
> It should, something _is_ broken in this regard [and I'm working on it, just
> had sent a RFC to the cpufreq mailing list...]. Maybe this causes some
> strangeness, especially if you run a userspace cpufreq tool -- but maybe the
> p4-clockmod hardware is even more strange than I thought it to be, and is
> per _virtual_ CPU.
>
> Can you please apply the latest cpufreq snapshot from
> http://www.codemonkey.org.uk/projects/bitkeeper/cpufreq/
> , then the attached patch, and switch the CPU frequencies of both (virtual)
> CPUs around a bit, and after each switch,
> cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
> cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq
>
> and check whether the values are the same you wrote into the specific CPU's
> scaling_setspeed [if using the userspace governor] file?
Ok, I applied both patches.
root@localhost /sys/devices/system/cpu/cpu0/cpufreq# cat scaling_available_frequencies
350000 700000 1050000 1400000 1750000 2100000 2450000 2800000
root@localhost /sys/devices/system/cpu/cpu0/cpufreq# for f in `cat scaling_available_frequencies `; do echo $f >scaling_setspeed ; cat scaling_cur_freq ; done
350000
700000
1050000
1400000
1750000
2100000
2450000
2800000
Seems to work...
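As a side note, the eight values listed in scaling_available_frequencies are consistent with 12.5% steps of the 2800000 kHz maximum. A small sketch that reproduces them (the function name `clockmod_steps` is made up for illustration):

```shell
#!/bin/sh
# Print max_khz * n / 8 for n = 1..8, one value per line.
clockmod_steps() {
    max_khz=$1
    n=1
    while [ "$n" -le 8 ]; do
        echo $(( max_khz * n / 8 ))
        n=$(( n + 1 ))
    done
}

clockmod_steps 2800000   # prints 350000, 700000, ..., 2800000
```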
Some remarks:
- scaling_governor and scaling_setspeed get length 0 after being echoed to.
Other files keep the virtual size of 4096.
- scaling seems to work reliably now _if_ I repeat the scaling for
each virtual processor and make them the same. It doesn't do
anything useful if I only set cpu0.
- It's far more repeatable now. If I set the speed of virtual CPU0,
it really sets it, and only sets CPU0, unlike previously, when it worked
in only 50% of the cases or so.
However, what's the use of p4-clockmod if it doesn't have an impact on
the temperature and the power consumption of the CPU?
My Asus p4p800 seems to be able to set several voltages and frequencies
in the BIOS; can those be set at runtime? And/or is there any
documentation on this? This would make for a much more useful driver.
Thanks!
>
> Many thanks,
> Dominik
> diff -ruN linux-original/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c linux/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c
> --- linux-original/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c 2004-05-13 16:52:02.000000000 +0200
> +++ linux/arch/i386/kernel/cpu/cpufreq/p4-clockmod.c 2004-05-13 19:36:47.629852152 +0200
> @@ -68,11 +68,7 @@
> cpus_allowed = current->cpus_allowed;
>
> /* only run on CPU to be set, or on its sibling */
> -#ifdef CONFIG_SMP
> - affected_cpu_map = cpu_sibling_map[cpu];
> -#else
> affected_cpu_map = cpumask_of_cpu(cpu);
> -#endif
> set_cpus_allowed(current, affected_cpu_map);
> BUG_ON(!cpu_isset(smp_processor_id(), affected_cpu_map));
>
> @@ -273,11 +269,7 @@
>
> /* only run on CPU to be set, or on its sibling */
> cpus_allowed = current->cpus_allowed;
> -#ifdef CONFIG_SMP
> - affected_cpu_map = cpu_sibling_map[cpu];
> -#else
> affected_cpu_map = cpumask_of_cpu(cpu);
> -#endif
> set_cpus_allowed(current, affected_cpu_map);
> BUG_ON(!cpu_isset(smp_processor_id(), affected_cpu_map));
>
--
Rutger Nijlunsing ---------------------------- rutger ed tux tmfweb nl
never attribute to a conspiracy which can be explained by incompetence
----------------------------------------------------------------------
* Re: cpufreq and p4 prescott
2004-05-14 21:47 ` rutger
@ 2004-05-15 6:44 ` Dominik Brodowski
2004-05-15 10:52 ` rutger
0 siblings, 1 reply; 6+ messages in thread
From: Dominik Brodowski @ 2004-05-15 6:44 UTC (permalink / raw)
To: linux-kernel; +Cc: moqua, cpufreq, linux-kernel
On Fri, May 14, 2004 at 11:47:51PM +0200, rutger@nospam.com wrote:
> > Not necessarily. It's not really every eigth tick where work is done, but
> > more like 800 ticks where work is done, then 5600 ticks pause, and so on --
> > the frequency is somewhere in the docs, I forgot the exact value... So I'm
> > not 100% convinced the measurements you've done do show something broken.
>
> Ah, ok! This makes the measurement next to impossible. Unless we
> generate instructions of ~900 ticks, which should takes 900 + 5600
> ticks in case of modulated clock, and 900 ticks in case of
> non-modulated clock. Something to try...
As I said, I forgot the actual frequency, so 800 ticks is a guess...
> root@localhost /sys/devices/system/cpu/cpu0/cpufreq# cat scaling_available_frequencies
> 350000 700000 1050000 1400000 1750000 2100000 2450000 2800000
> root@localhost /sys/devices/system/cpu/cpu0/cpufreq# for f in `cat scaling_available_frequencies `; do echo $f >scaling_setspeed ; cat scaling_cur_freq ; done
> 350000
> 700000
> 1050000
> 1400000
> 1750000
> 2100000
> 2450000
> 2800000
>
> Seems to work...
Hm, could you please do
# for f in `cat scaling_available_frequencies `; do echo $f >scaling_setspeed ; cat cpuinfo_cur_freq ; cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq ; done
instead? scaling_cur_freq doesn't give as useful _debug_ results as
cpuinfo_cur_freq, and it's important to get it for _both_ siblings after
_each_ change
> Some remarks:
> - scaling_governor and scaling_setspeed get length 0 after echo-ing to.
> Other files keep the virtual size of 4096.
That's some sort of sysfs "handling"; I don't know the details or
consequences.
> - scaling seems to work reliable now _if_ I repeat the scaling for
> each virtual processor and make them the same. It doesn't do
> anything useful if I only set cpu0.
Maybe because much/more work is done by the other sibling then... however,
without the test above [cpuinfo_cur_freq for both siblings] I can't say
much, I'm afraid.
> However, what's the use of p4-clockmod if it doesn't have impact on
> the temperature and the power consumption of the CPU?
The use of the p4-clockmod driver is that it puts the CPU into a low-power
state. However, it only has thermal and power consequences if either the
"idling" does not work, or the processor load is higher than what the
frequency set by p4-clockmod can handle.
> My Asus p4p800 seems to be able to set several voltages and frequences
> in the BIOS; can those be set runtime?
No. This is motherboard-specific. The P4 does not support _voltage scaling_,
i.e. runtime voltage adjustment based on current power needs. It also
doesn't support _frequency scaling_, just (thermal) throttling.
> And/or is there any
> documentation on this? This would make for a much more useful driver.
Unfortunately, p4-clockmod isn't really all that useful -- but that's
because the hardware doesn't support voltage and/or frequency scaling.
Dominik
* Re: cpufreq and p4 prescott
2004-05-15 6:44 ` Dominik Brodowski
@ 2004-05-15 10:52 ` rutger
2004-05-15 19:41 ` Dominik Brodowski
0 siblings, 1 reply; 6+ messages in thread
From: rutger @ 2004-05-15 10:52 UTC (permalink / raw)
To: cpufreq, linux-kernel, moqua
> > Ah, ok! This makes the measurement next to impossible. Unless we
> > generate instructions of ~900 ticks, which should takes 900 + 5600
> > ticks in case of modulated clock, and 900 ticks in case of
> > non-modulated clock. Something to try...
>
> As I said, I forgot the actual frequency, so 800 ticks is a guess...
The only thing I could find in Intel's documentation is that the maximum
throttling time is 3 microseconds (p. 67; section 5.2.1 of the Prescott
datasheet). So these 3 microseconds should correspond to 5600 ticks or
so...
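A quick unit check, as a sketch: it assumes the tick rate equals the 2800000 kHz maximum listed by scaling_available_frequencies, which the thread does not actually confirm. The helper names are hypothetical.

```shell
#!/bin/sh
# Ticks elapsed in `us` microseconds at a clock of `khz` kHz.
ticks_in_us() {
    khz=$1
    us=$2
    echo $(( khz * us / 1000 ))
}

# Microseconds needed for `ticks` ticks at `khz` kHz (integer division).
us_for_ticks() {
    khz=$1
    ticks=$2
    echo $(( ticks * 1000 / khz ))
}

ticks_in_us 2800000 3      # prints 8400
us_for_ticks 2800000 5600  # prints 2
```

Under that assumption, 3 microseconds would span about 8400 ticks, and 5600 ticks would last about 2 microseconds, so the two figures agree only roughly.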
>
> > root@localhost /sys/devices/system/cpu/cpu0/cpufreq# cat scaling_available_frequencies
> > 350000 700000 1050000 1400000 1750000 2100000 2450000 2800000
> > root@localhost /sys/devices/system/cpu/cpu0/cpufreq# for f in `cat scaling_available_frequencies `; do echo $f >scaling_setspeed ; cat scaling_cur_freq ; done
> > 350000
> > 700000
> > 1050000
> > 1400000
> > 1750000
> > 2100000
> > 2450000
> > 2800000
> >
> > Seems to work...
> Hm, could you please do
>
> # for f in `cat scaling_available_frequencies `; do echo $f >scaling_setspeed ; cat cpuinfo_cur_freq ; cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq ; done
>
> instead? scaling_cur_freq doesn't give as useful _debug_ results as
> cpuinfo_cur_freq, and it's important to get it for _both_ siblings after
> _each_ change
/sys/devices/system/cpu/cpu0/cpufreq# for f in `cat scaling_available_frequencies `; do echo $f >scaling_setspeed ; cat cpuinfo_cur_freq ; cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq; done
350000
2800000
700000
2800000
1050000
2800000
1400000
2800000
1750000
2800000
2100000
2800000
2450000
2800000
2800000
2800000
...so it only has an effect on the same sibling, not the other. That's
what I meant by 'repeat scaling for each virtual processor'.
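The interleaved output above (cpu0 value, then cpu1 value) can be checked mechanically. A minimal sketch, with the hypothetical helper `count_sibling_mismatches` counting how many pairs differ:

```shell
#!/bin/sh
# Reads alternating lines from stdin (cpu0 value, then cpu1 value) and
# prints the number of pairs whose two values differ.
count_sibling_mismatches() {
    mismatches=0
    while read -r cpu0 && read -r cpu1; do
        if [ "$cpu0" != "$cpu1" ]; then
            mismatches=$(( mismatches + 1 ))
        fi
    done
    echo "$mismatches"
}

# The sixteen values reported above, in the same order.
printf '%s\n' \
    350000 2800000 700000 2800000 1050000 2800000 1400000 2800000 \
    1750000 2800000 2100000 2800000 2450000 2800000 2800000 2800000 |
    count_sibling_mismatches   # prints 7
```

Only the last pair agrees, because that is the step where cpu0 is set back to the maximum that cpu1 never left.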
> > - scaling seems to work reliable now _if_ I repeat the scaling for
> > each virtual processor and make them the same. It doesn't do
> > anything useful if I only set cpu0.
>
> Maybe because much/more work is done by the other sibling then... however,
> without the test above [cpuinfo_cur_freq for both siblings] I can't say
> much, I'm afraid.
That's true. I can set the freq. of each virtual CPU. Probably not
very useful, and even confusing. And if we keep this, the scheduler
should be told about the speed differences of both (virtual)
processors.
> > However, what's the use of p4-clockmod if it doesn't have impact on
> > the temperature and the power consumption of the CPU?
>
> The use of the p4-clockmod driver is that it puts the CPU into a low-power
> state -- it only has thermal and power consequences, however, if either the
> "idling" does not work, or the processor load is higher than the frequency
> the CPU is put into by p4-clockmod.
I saw several sleep states in which the processor can reside (like
when using the 'hlt' instruction) like S3; would those help?
>
> > My Asus p4p800 seems to be able to set several voltages and frequences
> > in the BIOS; can those be set runtime?
>
> No. This is motherboard-specific. The P4 does not support _voltage scaling_,
> i.e. runtime voltage adjustment based on current power needs. It also
> doesn't support _frequency scaling_, just (thermal) throttling.
I know this is not P4 specific, but motherboard specific, but do
you know of modules which use motherboard specific knowledge to scale
the processor? If the BIOS can do it, so should we be able to do it.
Regards,
Rutger.
--
Rutger Nijlunsing ---------------------------- rutger ed tux tmfweb nl
never attribute to a conspiracy which can be explained by incompetence
----------------------------------------------------------------------
* Re: cpufreq and p4 prescott
2004-05-15 10:52 ` rutger
@ 2004-05-15 19:41 ` Dominik Brodowski
2004-05-16 13:20 ` Rutger Nijlunsing
0 siblings, 1 reply; 6+ messages in thread
From: Dominik Brodowski @ 2004-05-15 19:41 UTC (permalink / raw)
To: linux-kernel; +Cc: moqua, cpufreq, linux-kernel
On Sat, May 15, 2004 at 12:52:01PM +0200, rutger@nospam.com wrote:
> > > Ah, ok! This makes the measurement next to impossible. Unless we
> > > generate instructions of ~900 ticks, which should takes 900 + 5600
> > > ticks in case of modulated clock, and 900 ticks in case of
> > > non-modulated clock. Something to try...
> >
> > As I said, I forgot the actual frequency, so 800 ticks is a guess...
>
> The only thing I could find in Intel's documentation is the max. time
> of throttling is 3 microseconds (p.67; 5.2.1 of Prescott
> datasheet). So this 3 microseconds should correspond to 5600 ticks or
> so...
Can't find it in the datasheets right now, but did find an interesting
comment in section 13.15.3 of 24547212.pdf which explains the strange
behaviour we're seeing.
> ..so it has only effect on the same sibbling, not the other. That's
> what I meant with 'repeat scaling for each virtual processor'.
This is so strange... but it is what's to be found in said section in the
datasheet. It says both logical CPUs need to be set identically so that it
works "properly", i.e. as expected.
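Until the driver handles this itself, a userspace workaround would be to write the same value to every logical CPU. The following is only a sketch, assuming the userspace governor, root privileges, and that the listed CPU numbers are siblings of one physical package. `CPU_ROOT` and `set_siblings` are illustrative names.

```shell
#!/bin/sh
# Base directory of the per-CPU sysfs entries; overridable for dry runs.
CPU_ROOT=${CPU_ROOT:-/sys/devices/system/cpu}

# Write the same frequency (kHz) to scaling_setspeed of every listed
# logical CPU, stopping at the first failure.
set_siblings() {
    khz=$1
    shift
    for cpu in "$@"; do
        echo "$khz" > "$CPU_ROOT/cpu$cpu/cpufreq/scaling_setspeed" || return 1
    done
}
```

For the two-sibling system in this thread, the call would be `set_siblings 1400000 0 1`.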
> That's true. I can set the freq. of each virtual CPU. Probably not
> very useful, and even confusing. And if we keep this,
I think we should not keep it; I'll prepare a patch soon.
> the scheduler
> should be told about the speed differences of both (virtual)
> processors.
On (real) SMP systems this is an issue; but even more on SMP systems where
true frequency and voltage scaling is done.
> > > However, what's the use of p4-clockmod if it doesn't have impact on
> > > the temperature and the power consumption of the CPU?
> >
> > The use of the p4-clockmod driver is that it puts the CPU into a low-power
> > state -- it only has thermal and power consequences, however, if either the
> > "idling" does not work, or the processor load is higher than the frequency
> > the CPU is put into by p4-clockmod.
>
> I saw several sleep states in which the processor can reside (like
> when using the 'hlt' instruction) like S3; would those help?
If you mean C3, then that's very good. ACPI C-States are "idling" -- ACPI
S-States (like S3) are for "suspend to ram/disk"
> I know this is not P4 specific, but motherboard specific, but do
> you know of modules which use motherboard specific knowledge to scale
> the processor?
No.
> If the BIOS can do it, so should we be able to do it.
Dynamic frequency scaling is (probably) way different from setting a
frequency at boot (which is what the BIOS does). Timing issues, settling
times, etc. are way too complicated, AFAICS. Even trying to do this might
result in severe non-recoverable hardware failures.
Dominik
* Re: cpufreq and p4 prescott
2004-05-15 19:41 ` Dominik Brodowski
@ 2004-05-16 13:20 ` Rutger Nijlunsing
0 siblings, 0 replies; 6+ messages in thread
From: Rutger Nijlunsing @ 2004-05-16 13:20 UTC (permalink / raw)
To: cpufreq, linux-kernel, moqua
> > The only thing I could find in Intel's documentation is the max. time
> > of throttling is 3 microseconds (p.67; 5.2.1 of Prescott
> > datasheet). So this 3 microseconds should correspond to 5600 ticks or
> > so...
>
> Can't find it in the datasheets right now, but did find an interesting
> comment in section 13.15.3 of 24547212.pdf which explains the strange
> behaviour we're seeing.
Hm, 13.16.3 in my version, but indeed: all logical processors should
be put to sleep in the same way ;)
>
> > I know this is not P4 specific, but motherboard specific, but do
> > you know of modules which use motherboard specific knowledge to scale
> > the processor?
> No.
> > If the BIOS can do it, so should we be able to do it.
>
> Dynamic frequency scaling is (probably) way different from setting a
> frequency at boot (which is what the BIOS does). Timing issues, settling
> times, etc. are way too complicated, AFAICS. Even trying to do this might
> result in severe non-recoverable hardware failures.
Probably true for some motherboards, but Asus has a WinXP program
called 'AiBooster' which can under/overclock from -50% to
+33% at runtime (butt-ugly UI can be seen in
http://www.asuscom.de/pub/ASUS/mb/sock478/p4p800/AIBooster_u.pdf). Could
Wine be used (given the right permissions) to run or dissect such a
utility to make underclocking a reality under Linux?
*hopeful* Or has Asus released the specification of its motherboard?
--
Rutger Nijlunsing ---------------------------- rutger ed tux tmfweb nl
never attribute to a conspiracy which can be explained by incompetence
----------------------------------------------------------------------