linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
To: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>,
	Zhang Rui <rui.zhang@intel.com>,
	amitk@kernel.org, "open list:THERMAL" <linux-pm@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] thermal: core: Add indication for userspace usage
Date: Mon, 30 Nov 2020 10:13:08 -0800	[thread overview]
Message-ID: <585bb5d3ee5bea063795682108576c3464ba72b6.camel@linux.intel.com> (raw)
In-Reply-To: <34348B03-5E27-49A0-A704-6332BAC00758@canonical.com>

On Tue, 2020-12-01 at 02:04 +0800, Kai-Heng Feng wrote:
> > On Dec 1, 2020, at 00:19, Srinivas Pandruvada <
> > srinivas.pandruvada@linux.intel.com> wrote:
> > 
> > On Mon, 2020-11-30 at 16:23 +0800, Kai-Heng Feng wrote:
> > > > On Nov 30, 2020, at 15:57, Daniel Lezcano <
> > > > daniel.lezcano@linaro.org> wrote:
> > > > 
> > > > 
> > > > [Added Srinivas]
> > > > 
> > > > On 28/11/2020 18:54, Kai-Heng Feng wrote:
> > > > > We are seeing thermal shutdown on Intel based mobile
> > > > > workstations, the
> > > > > shutdown happens during the first trip handle in
> > > > > thermal_zone_device_register():
> > > > > kernel: thermal thermal_zone15: critical temperature reached
> > > > > (101
> > > > > C), shutting down
> > > > > 
> > > > > However, we shouldn't do a thermal shutdown here, since
> > > > > 1) We may want to use a dedicated daemon, Intel's thermald in
> > > > > this case,
> > > > > to handle thermal shutdown.
> > > > > 
> > > > > 2) For ACPI based system, _CRT doesn't mean shutdown unless
> > > > > it's
> > > > > inside
> > > > > ThermalZone. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> > > > > "... If this object it present under a device, the device’s
> > > > > driver
> > > > > evaluates this object to determine the device’s critical
> > > > > cooling
> > > > > temperature trip point. This value may then be used by the
> > > > > device’s
> > > > > driver to program an internal device temperature sensor trip
> > > > > point."
> > > > > 
> > > > > So a "critical trip" here merely means we should take a more
> > > > > aggressive
> > > > > cooling method.
> > > > 
> > > > Well, actually it is stated before:
> > > > 
> > > > "This object, when defined under a thermal zone, returns the
> > > > critical
> > > > temperature at which OSPM must shutdown the system".
> > > 
> > > This means specifically for the ACPI ThermalZone in AML, e.g.:
> > > 
> > > ThermalZone (TZ0) {
> > > ....
> > >    Method(_CRT) { ... }
> > > } // end of TZ0
> > > 
> > > However the device is not under any ACPI ThermalZone.
> > > 
> > > > That is what does the thermal subsystem, no ?
> > > > 
> > > > > So add an indication to let thermal core know it should leave
> > > > > thermal
> > > > > device to userspace to handle.
> > > > 
> > > > You may want to check the 'HOT' trip point and then use the
> > > > notification
> > > > mechanism to get notified in userspace and take action from
> > > > there
> > > > (eg.
> > > > offline some CPUs).
> > > 
> > > For this particular issue we are facing, the thermal shutdown
> > > happens
> > > in thermal_zone_device_register() and userspace isn't up yet.
> > 
> > What about creating an new callback
> > 
> > enum thermal_trip_status {
> > 	THERMAL_TRIP_DISABLED = 0,
> > 	THERMAL_TRIP_ENABLED,
> > };
> > 
> > int get_trip_status(struct thermal_zone_device *, int trip, enum
> > thermal_trip_status *state);
> > 
> > Then in 
> > static void handle_thermal_trip(struct thermal_zone_device *tz, int
> > trip)
> > {
> > 
> > /* before tz->ops->get_trip_temp(tz, trip, &trip_temp); */
> > if (tz->ops->get_trip_status) {
> > 	enum thermal_trip_status *status;
> > 
> > 	if (!tz->ops->get_trip_status(tz, trip, &status)) {
> > 		if (status == THERMAL_TRIP_DISABLED)
> > 			return;	
> > 	}
> > }
> > ...
> > ...
> > 
> > }
> > 
> > 
> > This callback will help the cases:
> > - Allows drivers to selectively disable certain trips during init
> > state
> > or system resume where there can be spikes or always. int340x
> > drivers
> > can disable always.
> 
> This sounds really great. This is indeed can happen on system resume,
> before userspace process thaw.
> 
> > - Still give options for drivers to handle critical trip even if
> > they
> > are bound to user space governors. User space process may be dead,
> > so
> > still allow kernel to process graceful shutdown
> 
> To make the scenario happen, do we need a new sysfs to let usespace
> enable it with THERMAL_TRIP_ENABLED?
This should be drivers call not user space.

Thanks,
Srinivas


> 
> Kai-Heng
> 
> > Thanks,
> > Srinivas
> > 
> > > Kai-Heng
> > > 
> > > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > > ---
> > > > > drivers/thermal/thermal_core.c | 3 +++
> > > > > include/linux/thermal.h        | 2 ++
> > > > > 2 files changed, 5 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/thermal/thermal_core.c
> > > > > b/drivers/thermal/thermal_core.c
> > > > > index c6d74bc1c90b..6561e3767529 100644
> > > > > --- a/drivers/thermal/thermal_core.c
> > > > > +++ b/drivers/thermal/thermal_core.c
> > > > > @@ -1477,6 +1477,9 @@ thermal_zone_device_register(const char
> > > > > *type, int trips, int mask,
> > > > > 			goto unregister;
> > > > > 	}
> > > > > 
> > > > > +	if (tz->tzp && tz->tzp->userspace)
> > > > > +		thermal_zone_device_disable(tz);
> > > > > +
> > > > > 	mutex_lock(&thermal_list_lock);
> > > > > 	list_add_tail(&tz->node, &thermal_tz_list);
> > > > > 	mutex_unlock(&thermal_list_lock);
> > > > > diff --git a/include/linux/thermal.h
> > > > > b/include/linux/thermal.h
> > > > > index d07ea27e72a9..e8e8fac78fc8 100644
> > > > > --- a/include/linux/thermal.h
> > > > > +++ b/include/linux/thermal.h
> > > > > @@ -247,6 +247,8 @@ struct thermal_zone_params {
> > > > > 	 */
> > > > > 	bool no_hwmon;
> > > > > 
> > > > > +	bool userspace;
> > > > > +
> > > > > 	int num_tbps;	/* Number of tbp entries */
> > > > > 	struct thermal_bind_params *tbp;
> > > > > 
> > > > > 
> > > > 
> > > > -- 
> > > > <http://www.linaro.org/> Linaro.org │ Open source software for
> > > > ARM
> > > > SoCs
> > > > 
> > > > Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook
> > > > |
> > > > <http://twitter.com/#!/linaroorg> Twitter |
> > > > <http://www.linaro.org/linaro-blog/> Blog


  reply	other threads:[~2020-11-30 18:15 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-28 17:54 [PATCH 1/3] thermal: core: Add indication for userspace usage Kai-Heng Feng
2020-11-28 17:54 ` [PATCH 2/3] thermal: int340x: Indicate " Kai-Heng Feng
2020-11-30  5:29   ` Srinivas Pandruvada
2020-11-30  5:46     ` Kai-Heng Feng
2020-11-28 17:54 ` [PATCH 3/3] thermal: intel: intel_pch_thermal: " Kai-Heng Feng
2020-11-30  7:57 ` [PATCH 1/3] thermal: core: Add indication for " Daniel Lezcano
2020-11-30  8:23   ` Kai-Heng Feng
2020-11-30 16:19     ` Srinivas Pandruvada
2020-11-30 18:04       ` Kai-Heng Feng
2020-11-30 18:13         ` Srinivas Pandruvada [this message]
2020-11-30 18:22           ` Kai-Heng Feng
2020-11-30 18:39             ` Srinivas Pandruvada
2020-12-07  5:36               ` Kai-Heng Feng
2020-12-09  9:30                 ` Daniel Lezcano
2020-12-09 16:10                   ` Srinivas Pandruvada
  -- strict thread matches above, loose matches on Subject: below --
2020-11-30  5:36 Kai-Heng Feng
2020-12-14 18:21 ` Matthew Garrett
2020-12-15 12:49   ` Kai-Heng Feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=585bb5d3ee5bea063795682108576c3464ba72b6.camel@linux.intel.com \
    --to=srinivas.pandruvada@linux.intel.com \
    --cc=amitk@kernel.org \
    --cc=daniel.lezcano@linaro.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rui.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).