From: Mason
Subject: Using a temperature sensor with 1-bit output for CPU throttling
Date: Tue, 28 Apr 2015 13:27:01 +0200
Message-ID: <553F6E85.5090402@free.fr>
List-Id: linux-pm@vger.kernel.org
To: Linux PM
Cc: cpufreq, Zhang Rui, Eduardo Valentin

Hello everyone,

The SoC I'm working on provides a temperature sensor (NXP) in the CPU
block. The sensor seems to be very primitive, so I wanted to ask
experienced people what would be the best way to use it from Linux.

General Description

"The sensor generates an output signal that indicates if the die
temperature exceeds a programmable threshold. This makes it
particularly suitable for detecting overheating."

So it seems that the original purpose of this sensor was to
periodically check that the temperature has not exceeded a given
threshold.

- Is the CPU temp higher than 100°C?
- No.
- OK. Business as usual.
(1 second later)
- Is the CPU temp higher than 100°C?
- Yes.
- Uh-oh! I need to do something about it.

Basic Functions

"The temp sensor uses a bandgap type of circuit to compare a voltage
which has a negative temperature coefficient with a voltage that is
proportional to absolute temperature. A resistor bank allows 40
different temperature thresholds to be selected and the logic output
'out_temperature' will then indicate whether the actual die
temperature lies above or below the selected threshold."

The available thresholds seem to be chosen somewhat arbitrarily:

-45.1, -39.7, -33.7, -29.4, -24.4, -20.4, -15.4, -10.1, -6.4, -1.4,
3.6, 7.6, 12.9, 16.6, 20.6, 25.6, 30.9, 34.9, 38.6, 43.9,
48.9, 52.9, 57.9, 61.9, 66.9, 70.9, 76.3, 81.3, 85.3, 90.3,
95.3, 98.9, 102.9, 108.3, 111.9, 117.3, 122.3, 126.3, 131.3, 135.3,
139.3

The spacing between values seems arbitrary also.
(Is there an underlying physical explanation?)

I'm not sure that there is much point in testing for temperatures
lower than 50°C? (I'm told that the SoC can reliably function up to
125°C.)

Do higher temperatures shorten the lifespan of a component?
In other words, would a CPU running 24/7 at 100°C "break" sooner
than one running 24/7 at 50°C?

Characteristics

Symbol       Parameter                      Min    Typ    Max    Unit
(Operating conditions)
Tjunc        Junction temperature           -40    25     125    °C
Vdd          Supply voltage                 1.0    1.1    1.26   V
(Normal operating mode)
Idd          Supply current                        50     60     μA
Vbandgapref  Ref output voltage             0.72   0.8    0.88   V
∆outtemp     Absolute Temp threshold error         ±2     ±10    °C
T_res        Temp resolution                3      4.5    7      °C

Given the semantics of the temperature sensor hardware block, I was
tempted to implement something along these lines:

Create a kernel thread that runs periodically (e.g. every second) to
check if the temperature is above 100°C.
- If not, do nothing.
- If yes, somehow prevent the CPU from using the highest frequencies
  defined in cpufreq's freq table. (They are 1000, 500, 333, 200,
  100 MHz.)

Is that a sensible approach? Is there a way to implement this using
the thermal framework? Or am I looking at this wrong, and things
should be done a different way? (I'm using Linux 3.14, by the way.)

I suppose I could perform some kind of binary search to zoom in on
the current temperature (although it might change during the
measurements, so I'd rather not go there).

Regards.
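
To make the two ideas above concrete, here is a minimal user-space
sketch in C. It is not kernel code and does not use the thermal or
cpufreq APIs. The helper sensor_above() is a hypothetical stand-in for
"program threshold idx, then read the 1-bit output", simulated here
with a fake temperature. The trip/clear indices (98.9 and 90.3) and
the one-step-per-poll policy are my assumptions, not something taken
from the datasheet.

```c
#include <assert.h>
#include <stddef.h>

/* Threshold table copied from the list above (degrees C). */
static const double thresholds[] = {
	-45.1, -39.7, -33.7, -29.4, -24.4, -20.4, -15.4, -10.1, -6.4, -1.4,
	3.6, 7.6, 12.9, 16.6, 20.6, 25.6, 30.9, 34.9, 38.6, 43.9,
	48.9, 52.9, 57.9, 61.9, 66.9, 70.9, 76.3, 81.3, 85.3, 90.3,
	95.3, 98.9, 102.9, 108.3, 111.9, 117.3, 122.3, 126.3, 131.3, 135.3,
	139.3
};
#define N_THRESHOLDS (sizeof(thresholds) / sizeof(thresholds[0]))

/* cpufreq table from the message, highest frequency first (MHz). */
static const unsigned int freq_mhz[] = { 1000, 500, 333, 200, 100 };
#define N_FREQS (sizeof(freq_mhz) / sizeof(freq_mhz[0]))

/* Assumed policy: throttle above 98.9, relax only below 90.3. */
#define TRIP_IDX  31	/* thresholds[31] == 98.9 */
#define CLEAR_IDX 29	/* thresholds[29] == 90.3 */

/* Simulated die temperature; real code would read the hardware. */
static double fake_die_temp = 25.0;

/* Hypothetical sensor access: 1 if die temp is above thresholds[idx]. */
static int sensor_above(size_t idx)
{
	assert(idx < N_THRESHOLDS);
	return fake_die_temp > thresholds[idx];
}

/*
 * Binary search: return how many thresholds lie below the die
 * temperature. A result k means thresholds[k-1] < temp <= thresholds[k]
 * (open-ended for k == 0 and k == N_THRESHOLDS). Assumes the
 * temperature holds steady across the reads.
 */
static size_t count_thresholds_below(void)
{
	size_t lo = 0, hi = N_THRESHOLDS;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (sensor_above(mid))
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * One periodic poll. 'cap' indexes freq_mhz[]; a larger index means a
 * lower maximum frequency. Step down while hot, step back up only once
 * the temperature is below the clear threshold, otherwise hold.
 */
static size_t poll_once(size_t cap)
{
	if (sensor_above(TRIP_IDX)) {
		if (cap + 1 < N_FREQS)
			cap++;
	} else if (!sensor_above(CLEAR_IDX)) {
		if (cap > 0)
			cap--;
	}
	return cap;
}
```

With 41 table entries the search needs at most 6 reads. The gap
between the trip and clear thresholds is there so that the cap does
not flip back and forth on every poll when the die sits right at the
trip point; how wide it should be given the stated threshold error is
an open question.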