* cpufreq stops working after a while
@ 2006-08-11 18:25 Mark Lord
  2006-08-11 18:39 ` Dave Jones
  2006-08-11 18:46 ` Andrew Morton
  0 siblings, 2 replies; 39+ messages in thread
From: Mark Lord @ 2006-08-11 18:25 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Andrew Morton

One of my notebooks (Dell Latitude X1) has a 1.1GHz Pentium-M ULV processor.
This chip can change CPU speeds from 600 -> 800 -> 1100 MHz.

I use speedstep-centrino with it, and after boot all is usually okay.
But after a few hours of operation, it stops shifting to the highest frequency
even under continuous 100% load (or not).  Eventually it gets stuck at 600MHz
and stays there until I reboot.

Sometimes rebooting doesn't even restore it.

/sys/devices/system/cpu/cpu0/cpufreq is all very normal looking,
showing the available frequencies and other info.  All of the attribs
there look fine, except for "scaling_max_freq", which is what seems
to gradually get set smaller.  For instance, right now it is set to 800000,
and it won't let me change it (echo 1100000 > scaling_max_freq has no effect).

WHY?  And how can I fix it?


* RE: cpufreq stops working after a while
@ 2006-08-11 19:55 Pallipadi, Venkatesh
  2006-08-11 20:29 ` Mark Lord
  0 siblings, 1 reply; 39+ messages in thread
From: Pallipadi, Venkatesh @ 2006-08-11 19:55 UTC (permalink / raw)
  To: Mark Lord, Dave Jones, Linux Kernel, Andrew Morton



>-----Original Message-----
>From: linux-kernel-owner@vger.kernel.org 
>[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Mark Lord
>Sent: Friday, August 11, 2006 12:41 PM
>To: Dave Jones; Linux Kernel; Andrew Morton
>Subject: Re: cpufreq stops working after a while
>
>Dave Jones wrote:
>> 
>> boot with cpufreq.debug=7, and capture dmesg output after it fails
>> to transition.  This might be another manifestation of the mysterious
>> "highest frequency isnt accessable" bug, that seems to come from
>> some recent change in acpi.
>
>booting with that option doesn't seem to give me any new messages
>in dmesg (or /var/log/messages).  I also tried editing cpufreq.c
>and hardcoding debug = 7 on the variable declaration.
>Still no new messages.
>
>??

You also need to configure in CONFIG_CPU_FREQ_DEBUG for the parameter to
take effect.

Thanks,
Venki

* RE: cpufreq stops working after a while
@ 2006-08-11 21:08 Pallipadi, Venkatesh
  0 siblings, 0 replies; 39+ messages in thread
From: Pallipadi, Venkatesh @ 2006-08-11 21:08 UTC (permalink / raw)
  To: Mark Lord, Dave Jones; +Cc: Linux Kernel, Andrew Morton

 

>-----Original Message-----
>From: Mark Lord [mailto:lkml@rtr.ca] 
>Sent: Friday, August 11, 2006 1:30 PM
>To: Dave Jones
>Cc: Pallipadi, Venkatesh; Linux Kernel; Andrew Morton
>Subject: Re: cpufreq stops working after a while
>
>Pallipadi, Venkatesh wrote:
>>> Dave Jones wrote:
>>>> boot with cpufreq.debug=7, and capture dmesg output after it fails
>>>> to transition.  This might be another manifestation of the mysterious
>>>> "highest frequency isnt accessable" bug, that seems to come from
>>>> some recent change in acpi.
>..
>> You also need to configure in CONFIG_CPU_FREQ_DEBUG
>
>Thanks, Venki!
>
>Okay, here's the tail end of the trace, in which (search for "max")
>one can see the top frequency limit being downgraded.
>
>But, by whom, and why ??
>And what's with these requests for oddball frequencies ("685714"),
>or is that just normal approximation within the governor?
>
>
>[  853.228000] cpufreq-core: updating policy for CPU 0
>[  853.228000] cpufreq-core: Warning: CPU frequency out of sync: cpufreq and timing core thinks of 1100000, is 800000 kHz.
>[  853.228000] cpufreq-core: notification 0 of frequency transition to 800000 kHz
>[  853.228000] userspace: saving cpu_cur_freq of cpu 0 to be 800000 kHz
>[  853.228000] cpufreq-core: notification 1 of frequency transition to 800000 kHz
>[  853.228000] cpufreq-core: scaling loops_per_jiffy to 3195840 for frequency 800000 kHz
>[  853.228000] userspace: saving cpu_cur_freq of cpu 0 to be 800000 kHz
>[  853.228000] cpufreq-core: setting new policy for CPU 0: 600000 - 1100000 kHz
>[  853.228000] freq-table: request for verification of policy (600000 - 1100000 kHz) for cpu 0
>[  853.228000] freq-table: verification lead to (600000 - 1100000 kHz) for cpu 0
>[  853.228000] freq-table: request for verification of policy (600000 - 800000 kHz) for cpu 0
>[  853.228000] freq-table: verification lead to (600000 - 800000 kHz) for cpu 0
>[  853.228000] cpufreq-core: new min and max freqs are 600000 - 800000 kHz
>[  853.228000] cpufreq-core: governor: change or update limits
>[  853.228000] cpufreq-core: __cpufreq_governor for CPU 0, event 3

Looks like thermal events are causing the CPU frequency limits to be
reduced. Are you running anything on the CPU when this happens? Is
there a thermal interface in /proc/acpi that can give you the current
temperature of the system?

Thanks,
Venki

* RE: cpufreq stops working after a while
@ 2006-08-11 21:38 Pallipadi, Venkatesh
  2006-08-11 21:53 ` Mark Lord
  0 siblings, 1 reply; 39+ messages in thread
From: Pallipadi, Venkatesh @ 2006-08-11 21:38 UTC (permalink / raw)
  To: Mark Lord; +Cc: Dave Jones, Linux Kernel, Andrew Morton

 

>-----Original Message-----
>From: Mark Lord [mailto:lkml@rtr.ca] 
>Sent: Friday, August 11, 2006 2:25 PM
>To: Pallipadi, Venkatesh
>Cc: Dave Jones; Linux Kernel; Andrew Morton
>Subject: Re: cpufreq stops working after a while
>
>Mark Lord wrote:
>>
>>> Venki wrote:
>>> Looks like there are thermal events happening that is causing CPU limits
>>> to reduce. Are you running anything on the CPU when this happens. Is
>>> there a thermal interface in /proc/acpi that can give you the current
>>> temperature of the system?
>> 
>> There are thermal thingies in /proc, and I'm watching the temperature
>> value from there (62C --> 65C), and the trip_points value is 95C..
>> 
>> Think it's thermal?
>
>Yup, thermal.
>Trips shortly after I see 66C in 
>/proc/acpi/thermal_zone/THM/temperature
>
>If I stop number crunching for a bit, the temperature drops down to the
>low 50's, and the max freq then gets set back to 1100.
>
>Mmmm.. is there a way to control the high/low thermostat values there?
>
>Cheers

What "cooling mode" do you have in
/proc/acpi/thermal_zone/THM/cooling_mode?
Output of all files in that directory will help.

Thanks,
Venki

* RE: cpufreq stops working after a while
@ 2006-08-11 22:18 Pallipadi, Venkatesh
  0 siblings, 0 replies; 39+ messages in thread
From: Pallipadi, Venkatesh @ 2006-08-11 22:18 UTC (permalink / raw)
  To: Mark Lord; +Cc: Dave Jones, Linux Kernel, Andrew Morton

 

>-----Original Message-----
>From: Mark Lord [mailto:lkml@rtr.ca] 
>Sent: Friday, August 11, 2006 2:54 PM
>To: Pallipadi, Venkatesh
>Cc: Dave Jones; Linux Kernel; Andrew Morton
>Subject: Re: cpufreq stops working after a while
>
>Pallipadi, Venkatesh wrote:
>>> Mark Lord wrote:
>>> Yup, thermal.
>>> Trips shortly after I see 66C in 
>>> /proc/acpi/thermal_zone/THM/temperature
>>>
>>> If I stop number crunching for a bit, the temperature drops down to the
>>> low 50's, and the max freq then gets set back to 1100.
>>>
>>> Mmmm.. is there a way to control the high/low thermostat values there?
>..
>> What is the "cooling mode" you have in
>> /proc/acpi/thermal_zone/THM/cooling_mode.
>> Output of all files in that directory will help.
>
>/proc/acpi/thermal_zone/THM/cooling_mode:
>	<setting not supported>
>	cooling mode:   critical
>
>/proc/acpi/thermal_zone/THM/polling_frequency:
>	<polling disabled>
>
>/proc/acpi/thermal_zone/THM/state:
>	state:                   ok
>
>/proc/acpi/thermal_zone/THM/temperature:
>	temperature:             49 C
>
>/proc/acpi/thermal_zone/THM/trip_points:
>	critical (S5):           95 C
>
>==========
>
>This is a passively cooled notebook, so there's no fan
>to control.  They probably self-limit the CPU speed when
>the temperature gets high to prevent meltdown of the drive.
>
>But I would like to raise the lower limit if possible,
>allowing the speed to bump back up at, say 58C rather
>than waiting for 52C as it currently does.
>
>??

The temperature at which passive cooling starts is set by the platform
manufacturer through the BIOS. You can check whether your BIOS has any
option to change it. Changing it manually with a custom DSDT etc. may be
risky :).
One thing you can try from software is the polling_frequency above. For
some reason polling is disabled there. Try setting it to 1 second and see
whether that makes any difference (echo 1 >
/proc/acpi/thermal_zone/THM/polling_frequency).
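
A minimal sketch of that suggestion, assuming the /proc/acpi/thermal_zone
interface shown in this thread. The function name set_polling_frequency is
hypothetical, and the THM zone name is taken from the output above; other
machines may name the zone differently.

```shell
# Sketch: set the polling interval (in seconds) of one ACPI thermal zone.
# zone_dir defaults to the THM zone seen in this thread (an assumption for
# any other machine). Writing to the real file requires root.
set_polling_frequency() {
    seconds=$1
    zone_dir=${2:-/proc/acpi/thermal_zone/THM}
    echo "$seconds" > "$zone_dir/polling_frequency" || return 1
    # Read the file back so the caller sees what it now reports.
    cat "$zone_dir/polling_frequency"
}

# Usage (as root):
#   set_polling_frequency 1
```

Reading the file back is only a convenience: it shows whether polling is
still reported as disabled after the write.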

Venki

* RE: cpufreq stops working after a while
@ 2006-08-15 13:27 Pallipadi, Venkatesh
  2006-08-15 15:07 ` Carlos Garcia Campos
  2006-08-16 19:28 ` Len Brown
  0 siblings, 2 replies; 39+ messages in thread
From: Pallipadi, Venkatesh @ 2006-08-15 13:27 UTC (permalink / raw)
  To: Carlos Garcia Campos, cpufreq

 

>-----Original Message-----
>From: cpufreq-bounces@lists.linux.org.uk 
>[mailto:cpufreq-bounces@lists.linux.org.uk] On Behalf Of 
>Carlos Garcia Campos
>Sent: Tuesday, August 15, 2006 4:07 AM
>To: cpufreq@lists.linux.org.uk
>Subject: Re: cpufreq stops working after a while
>
>On Tue, 15-08-2006 at 09:49 +0200, Thomas Renninger wrote:
>> On Sat, 2006-08-12 at 10:52 +0200, Erik Slagter wrote:
>> > On Fri, 11 Aug 2006 14:25:26 -0400 Mark Lord <lkml@rtr.ca> wrote:
>> > 
>> > > One of my notebooks (Dell Latitude X1) has a 1.1GHz Pentium-M ULV processor.
>> > > This chip can change CPU speeds from 600 -> 800 -> 1100 Mhz.
>> > > 
>> > > I use speedstep-centrino with it, and after boot all is usually okay.
>> > > But after a few hours of operation, it stops shifting to the highest frequency
>> > > even under continuous 100% load (or not).  Eventually it gets stuck at 600Mhz
>> > > and stays there until I reboot.
>
>I have the same problem. My laptop is Dell Latitude D600 (Intel(R)
>Pentium(R) M processor 1.60GHz). If I'm compiling something, for
>example, that takes a long time, scaling_max_freq is set to 600000 (the
>lowest). If I try to echo 1600000 to scaling_max_freq it does nothing.
>Only after some time, if the cpu load is not high, can I echo 1600000
>again, and it works without needing to reboot.
>

Looks like you have the same problem that Mark had in this original thread: thermal.
It is not a bug in cpufreq. Due to the CPU load, the system heats up, the platform decides to reduce the temperature using passive cooling, and as a result it reduces the frequency. Does your system have active cooling (fans), or does it allow only passive cooling? You can monitor the temperature by looking at the files under /proc/acpi/thermal_zone/*/*.
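
One way to collect those files in a single pass is sketched below. The
function name dump_thermal_zones is hypothetical, and the layout (path,
then tab-indented contents) simply mirrors the listing Mark posted earlier
in the thread. It assumes the /proc/acpi/thermal_zone interface discussed
here is present.

```shell
# Sketch: print every file of every ACPI thermal zone, each preceded by its
# path and with its contents indented by one tab.
dump_thermal_zones() {
    base=${1:-/proc/acpi/thermal_zone}
    for f in "$base"/*/*; do
        # Skip the unexpanded glob pattern when no zone exists.
        [ -f "$f" ] || continue
        printf '%s:\n' "$f"
        awk '{ print "\t" $0 }' "$f"
    done
}

# Usage:
#   dump_thermal_zones
```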

Thanks,
Venki

* RE: cpufreq stops working after a while
@ 2006-08-15 15:23 Pallipadi, Venkatesh
  2006-08-15 17:46 ` Carlos Garcia Campos
  0 siblings, 1 reply; 39+ messages in thread
From: Pallipadi, Venkatesh @ 2006-08-15 15:23 UTC (permalink / raw)
  To: Carlos Garcia Campos; +Cc: cpufreq

 

>-----Original Message-----
>From: Carlos Garcia Campos [mailto:carlosgc@gnome.org] 
>Sent: Tuesday, August 15, 2006 8:08 AM
>To: Pallipadi, Venkatesh
>Cc: cpufreq@lists.linux.org.uk
>Subject: RE: cpufreq stops working after a while
>
>On Tue, 15-08-2006 at 06:27 -0700, Pallipadi, Venkatesh wrote:
>>  
>> >-----Original Message-----
>> >From: cpufreq-bounces@lists.linux.org.uk 
>> >[mailto:cpufreq-bounces@lists.linux.org.uk] On Behalf Of 
>> >Carlos Garcia Campos
>> >Sent: Tuesday, August 15, 2006 4:07 AM
>> >To: cpufreq@lists.linux.org.uk
>> >Subject: Re: cpufreq stops working after a while
>> >
>> >
>> >I have the same problem. My laptop is Dell Latitude D600 (Intel(R)
>> >Pentium(R) M processor 1.60GHz). If I'm compiling something, for
>> >example, that takes a long time, scaling_max_freq is set to 600000 (the
>> >lowest). If I try to echo 1600000 to scaling_max_freq it do nothing.
>> >Only after some time if the cpu load is not high I can echoing 1600000
>> >again and it works without need to reboot. 
>> >
>> 
>> Looks like you have the same problem that Mark had in this original thread. Thermal.
>
>It never happened with older kernels (< 2.6.17, I think)
>

Can you confirm the latest kernel version in which the problem did not occur? That will help narrow this down.

>> It is not a bug in cpufreq. Just that due to cpu load, system is getting
>> heated up and platform decides to reduce the temperature using passive
>> cooling and as a result reduces the frequency. Does your system have
>> active cooling (fans) or does it allow only passive cooling? You can
>> monitor the temperature by looking at stuff under
>> /proc/acpi/termal_zone/*/*.
>
>Yes, my system has fans. Here is the contents of the files
>under /proc/acpi/termal_zone/*/*, if it helps:
>
>$ cat /proc/acpi/thermal_zone/THM/*
><setting not supported>
>cooling mode:   critical
><polling disabled>
>state:                   ok
>temperature:             47 C
>critical (S5):           102 C
>
>How can I solve the problem then? It's very annoying. 


Can you watch the temperature as the frequency drops? Continuously (every second) cat scaling_max_freq in /sys and the temperature in /proc as you run your load. My feeling is that you will see max freq drop as the temperature reaches around 60 degrees or so.
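
A minimal sketch of such a sampling loop follows. The function names
sample_line and watch_freq_and_temp are hypothetical, and the default paths
are assumptions taken from this thread (cpu0's scaling_max_freq and the THM
thermal zone); adjust them for other machines.

```shell
# Print one "max_freq - temperature" sample, e.g. "1600000 - 85 C".
sample_line() {
    freq_file=$1
    temp_file=$2
    freq=$(cat "$freq_file") || return 1
    # The temperature file looks like "temperature:             47 C".
    temp=$(awk '/^temperature:/ { print $2, $3 }' "$temp_file") || return 1
    printf '%s - %s\n' "$freq" "$temp"
}

# Print a sample every interval (default 1 second) until interrupted
# with Ctrl-C or until a file can no longer be read.
watch_freq_and_temp() {
    freq_file=${1:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq}
    temp_file=${2:-/proc/acpi/thermal_zone/THM/temperature}
    interval=${3:-1}
    while sample_line "$freq_file" "$temp_file"; do
        sleep "$interval"
    done
}

# Usage:
#   watch_freq_and_temp | tee freq-temp.log
```

Logging both values on one line makes it easy to spot the temperature at
which the maximum frequency changes.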


Thanks,
Venki

* RE: cpufreq stops working after a while
@ 2006-08-16 13:27 Pallipadi, Venkatesh
  2006-08-16 18:19 ` Carlos Garcia Campos
  0 siblings, 1 reply; 39+ messages in thread
From: Pallipadi, Venkatesh @ 2006-08-16 13:27 UTC (permalink / raw)
  To: Carlos Garcia Campos, cpufreq

 

>-----Original Message-----
>From: cpufreq-bounces@lists.linux.org.uk 
>[mailto:cpufreq-bounces@lists.linux.org.uk] On Behalf Of 
>Carlos Garcia Campos
>Sent: Wednesday, August 16, 2006 3:11 AM
>To: cpufreq@lists.linux.org.uk
>Subject: RE: cpufreq stops working after a while
>
>On Tue, 15-08-2006 at 19:46 +0200, Carlos Garcia Campos wrote:
>> On Tue, 15-08-2006 at 08:23 -0700, Pallipadi, Venkatesh wrote:
>> >  
>> > 
>> > Can you confirm the latest version of the kernel where the problem was not there. That will help on narrowing this down.
>> 
>> I'm not sure at all . . . I don't have any kernel < 2.6.17 compiled
>> right now.
>> 
>> > >> It is not a bug in cpufreq. Just that due to cpu load, system is getting
>> > >> heated up and platform decides to reduce the temperature using passive
>> > >> cooling and as a result reduces the frequency. Does your system have
>> > >> active cooling (fans) or does it allow only passive cooling? You can
>> > >> monitor the temperature by looking at stuff under
>> > >> /proc/acpi/termal_zone/*/*.
>> > >
>> > >Yes, my system has fans. Here is the contents of the files
>> > >under /proc/acpi/termal_zone/*/*, if it helps:
>> > >
>> > >$ cat /proc/acpi/thermal_zone/THM/*
>> > ><setting not supported>
>> > >cooling mode:   critical
>> > ><polling disabled>
>> > >state:                   ok
>> > >temperature:             47 C
>> > >critical (S5):           102 C
>> > >
>> > >How can I solve the problem then? It's very annoying. 
>> > 
>> > 
>> > Can you watch the temperature as you see the frequency drop.
>> > Continuously (every second) cat cpufreq_max_freq in /sys and
>> > temperature in /proc as you run you load. My feeling is you will see
>> > the drop in max freq as your temperature goes to around 60 degrees or so.
>> 
>> Here are the results:
>> 
>> ................
>> 1600000 - 85 C
>> 1600000 - 84 C
>> 1600000 - 85 C
>> 1600000 - 76 C
>> 600000 - 76 C
>> 600000 - 71 C
>> 600000 - 70 C
>> 600000 - 69 C
>> ................
>> 
>> It changed at 76 C.
>
>I forgot to mention that if I boot from battery scaling_max_freq is set
>to 600000 and I have to echo 1600000. At boot time temperature is not
>high so I'm not sure it's a thermal problem, or at least not only a
>thermal problem. 
>

That looks like a different problem. It may be a policy set by some userland daemon or startup script. Enable CONFIG_CPU_FREQ_DEBUG and boot with the boot parameter cpufreq.debug=7; you should then see when and why max_freq is changing. In fact, get the debug messages for the other problem as well.
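
To pick the limit changes out of a long debug log, something like the
following sketch may help. The function name limit_changes is hypothetical,
and the pattern assumes the "new min and max freqs are" message format that
appears in the trace Mark posted earlier in this thread; other kernel
versions may word it differently.

```shell
# Read kernel log text on stdin and print "<timestamp> <min_kHz> <max_kHz>"
# for each cpufreq-core line that reports a new policy range.
limit_changes() {
    sed -n 's/^\[ *\([0-9.]*\)\] cpufreq-core: new min and max freqs are \([0-9]*\) - \([0-9]*\) kHz.*/\1 \2 \3/p'
}

# Usage:
#   dmesg | limit_changes
```

Each output line gives the time of a change and the new range, which can
then be compared against a temperature log taken at the same time.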

One other thing you can try is changing the thermal_zone polling_frequency to 1 and seeing whether that changes the behavior when you run the workload.

Thanks,
Venki


End of thread (last message 2006-08-24 16:15 UTC).

Thread overview: 39+ messages
2006-08-11 18:25 cpufreq stops working after a while Mark Lord
2006-08-11 18:39 ` Dave Jones
2006-08-11 19:41   ` Mark Lord
2006-08-11 20:01     ` Mark Lord
2006-08-11 20:12       ` Dave Jones
2006-08-11 18:46 ` Andrew Morton
2006-08-11 19:01   ` Mark Lord
2006-08-11 19:01     ` Mark Lord
2006-08-11 19:10   ` Mark Lord
2006-08-11 19:18     ` Andrew Morton
2006-08-12  8:52   ` Erik Slagter
2006-08-15  7:49     ` Thomas Renninger
2006-08-15 11:07       ` Carlos Garcia Campos
  -- strict thread matches above, loose matches on Subject: below --
2006-08-11 19:55 Pallipadi, Venkatesh
2006-08-11 20:29 ` Mark Lord
2006-08-11 20:39   ` Mark Lord
2006-08-11 21:01     ` Dave Jones
2006-08-11 21:09       ` Mark Lord
2006-08-11 21:15       ` Mark Lord
2006-08-11 21:17         ` Mark Lord
2006-08-11 21:25         ` Mark Lord
2006-08-18 15:11           ` Pavel Machek
2006-08-24 14:44             ` Mark Lord
2006-08-24 16:15               ` Matthew Garrett
2006-08-11 21:08 Pallipadi, Venkatesh
2006-08-11 21:38 Pallipadi, Venkatesh
2006-08-11 21:53 ` Mark Lord
2006-08-11 22:18 Pallipadi, Venkatesh
2006-08-15 13:27 Pallipadi, Venkatesh
2006-08-15 15:07 ` Carlos Garcia Campos
2006-08-16 19:28 ` Len Brown
2006-08-15 15:23 Pallipadi, Venkatesh
2006-08-15 17:46 ` Carlos Garcia Campos
2006-08-16 10:10   ` Carlos Garcia Campos
2006-08-16 13:27 Pallipadi, Venkatesh
2006-08-16 18:19 ` Carlos Garcia Campos
2006-08-17 10:46   ` Thomas Renninger
2006-08-17 10:58     ` Carlos Garcia Campos
2006-08-17 15:28   ` Thomas Renninger
