linux-mediatek.lists.infradead.org archive mirror
From: Mikko Perttunen <mikko.perttunen@kapsi.fi>
To: Brian Norris <computersforpeace@gmail.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>,
	linux-pm@vger.kernel.org, Zhang Rui <rui.zhang@intel.com>,
	Eduardo Valentin <edubezval@gmail.com>,
	linux-kernel@vger.kernel.org,
	Stephen Warren <swarren@wwwdotorg.org>,
	kernel@pengutronix.de, linux-mediatek@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 11/15] thermal: thermal: Add support for hardware-tracked trip points
Date: Tue, 19 May 2015 15:43:59 +0300
Message-ID: <555B300F.8030305@kapsi.fi>
In-Reply-To: <20150518202848.GT11598@ld-irv-0074>

On 05/18/15 23:28, Brian Norris wrote:
> On Mon, May 18, 2015 at 10:13:46PM +0300, Mikko Perttunen wrote:
>> On 05/18/2015 09:44 PM, Brian Norris wrote:
>>> On Mon, May 18, 2015 at 02:09:44PM +0200, Sascha Hauer wrote:
>>>> On Mon, May 18, 2015 at 12:06:50PM +0300, Mikko Perttunen wrote:
>>>>> One interesting thing I noticed was that at least the bang-bang
>>>>> governor only acts if the temperature is properly smaller than (trip
>>>>> temp - hysteresis). So perhaps we should specify the non-tripping
>>>>> range as [low, high)? Or we could change bang-bang.
>>>>
>>>> I wonder how we can protect against such off-by-one errors anyway.
>>>> Generally a hardware might operate on raw values rather than directly
>>>> in temperature values in °C. This means a driver for this must have
>>>> celsius_to_raw and raw_to_celsius conversion functions. Now it can
>>>> happen that due to rounding errors celsius_to_raw(Tcrit) returns a raw
>>>> value that when converted back to celsius is different from the
>>>> original value in °C. This would mean the hardware triggers an interrupt
>>>> for a trip point and the thermal core does not react because get_temp
>>>> actually returns a different temperature than previously programmed as
>>>> interrupt trigger. This way we would lose hot (or cold) events.
>>>
>>> This also highlights another fact: there's a race between interrupt
>>> generation and temperature reading (->get_temp()). I would expect any
>>> hardware interrupt thermal sensor would also have a latched temperature
>>> reading to correspond with it, and there would be no guarantee that this
>>> latched temperature will match the polled reading seen once you reach
>>> thermal_zone_device_update(). So a hardware driver might report a
>>> thermal update, but the temperature reported to the core won't
>>> necessarily match what interrupt was meant for.
>>
>> Does this actually matter? The thermal core will reset trips and
>> apply cooling using the new - most recent - value. Using bang bang
>> as example, if the temperature has risen since the interrupt fired,
>> the cooling device will correctly not be switched off. If the
>> temperature has fallen, it will again be correctly switched off. The
>> only issue is then if the temperature is exactly 'trip temp - trip
>> hyst' which will cause set_trips to load the trip points below, but
>> not cause bang bang to turn off the cooling device, and the next
>> chance it will have will only be at the next below trip point. Well,
>> this is still safe (at least until you replace "cooling device" with
>> "heating device"), so maybe it isn't that big of an issue.
>>
>> Please point out if there's a problem with my line of reasoning.
>
> I'm not sure I followed exactly the reason for the low-temp/hyst corner
> case, but otherwise I guess that makes sense. The only problem IMO, is
> that you're encouraging the generation of spurious notifications; if the
> temperature is constantly changing right around 'trip temp', but it
> never settles above 'trip temp' long enough for the core to re-capture
> the high temperature scenario, you'll just keep making useless calls to
> thermal_zone_device_update(). This kind of defeats the purpose of the
> hysteresis, right?

The corner case with bang bang is as follows:
- Say we have trip points at 50C and 80C, both with 5C hysteresis, and
these are programmed into hardware. With the temperature between the
two, the actual hardware trip points are 45C and 80C.
- Currently the temperature is, say, 60C and the fan is turned on.
- Temperature drops to 45C, the lower trip point is triggered.
- 45C >= 50C - 5C, so the fan is not turned off.
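
The steps above can be reproduced with a small sketch. It is only a
simplified model of the bang bang rule as discussed in this thread, not
the governor's actual code: the names Trip, hw_low_threshold and
bang_bang_next are hypothetical, and the switch-on condition
(temp >= trip temp) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    temp: int  # trip temperature, whole degrees C
    hyst: int  # hysteresis, whole degrees C


def hw_low_threshold(trip: Trip) -> int:
    """Low hardware threshold for a trip the temperature is currently above."""
    return trip.temp - trip.hyst


def bang_bang_next(fan_on: bool, temp: int, trip: Trip) -> bool:
    """Simplified bang bang rule (hypothetical model).

    Switch-off needs temp strictly below (trip temp - hysteresis), as
    described in this thread. The switch-on condition (temp >= trip temp)
    is an assumption made for this sketch.
    """
    if temp >= trip.temp:
        return True
    if temp < trip.temp - trip.hyst:
        return False
    return fan_on  # inside the hysteresis band: keep the current state


trip = Trip(temp=50, hyst=5)
low = hw_low_threshold(trip)                 # 50 - 5 = 45
at_low = bang_bang_next(True, low, trip)     # interrupt fires at exactly 45C
below = bang_bang_next(True, low - 1, trip)  # one degree further down
print(low, at_low, below)
```

In this model the interrupt at exactly 45C leaves the fan on, and the
fan is only switched off once a reading strictly below 45C is seen.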

If the hysteresis were 0C and the trip point at 50C, then bang bang
would certainly be correct not to turn the fan off at exactly 50C,
since that is still greater than or equal to the requested temperature
for cooling.

The function you describe would certainly be useful for eliminating 
possible superfluous interrupts due to temperature wobble, though I'm 
not sure how much of a problem that even would be.

>
> I'd really rather have a high temperature interrupt generate exactly one
> notification to the core framework, and that the sensor driver can rely
> on that one interrupt being handled as a high temperature situation,
> allowing it to disable the high-temp interrupt.
>
> One of my biggest problems with the thermal subsystem so far is that
> thermal_zone_device_update() doesn't actually seem to have any specific
> semantic meaning. It just means that there was some reason to update
> something. So then, you have to reason about a particular thermal
> governor (bang bang) in order to make something sensible. If I want to
> use a different sort of user-space governor, then I have to reevaluate
> all these same assumptions, and it seems like I end up with a sub-par
> solution.

Yeah, though I'm not sure if you can ever be sure that the governor is 
fine not getting regular temperature updates, so I imagine you might 
always end up needing to pick your governors with that in mind. In 
practice, this might not be so horrible.

>
> As a side note: I have patches to extend some of the uevent information
> passed by the user-space governor too, to accomplish what I'm suggesting
> above. Perhaps that would be a better way to discuss what I'm thinking.
>
>> FWIW - at least Tegra doesn't have a latched register like this.
>> There's just a bit indicating that an interrupt was raised and a
>> temperature register that updates according to the sensor's input
>> clock.
>
> A sensor for Broadcom's BCM7xxx has a latched register. If I get the
> time, I'll post my driver soon.
>
> Brian
>

