From: Zhang Rui <rui.zhang@intel.com>
To: Keerthy <j-keerthy@ti.com>, Eduardo Valentin <edubezval@gmail.com>
Cc: linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-omap@vger.kernel.org, nm@ti.com, t-kristo@ti.com
Subject: Re: [PATCH] thermal: core: Add a back up thermal shutdown mechanism
Date: Wed, 12 Apr 2017 11:20:48 +0800
Message-ID: <1491967248.2357.25.camel@intel.com>
In-Reply-To: <f64632d5-c780-e5fe-cff7-8ed1459348a0@ti.com>

On Wed, 2017-04-12 at 08:19 +0530, Keerthy wrote:
> 
> On Tuesday 11 April 2017 10:59 PM, Eduardo Valentin wrote:
> > 
> > Hey,
> > 
> > On Fri, Mar 31, 2017 at 12:00:20PM +0530, Keerthy wrote:
> > > 
> > > > orderly_poweroff is triggered when a graceful shutdown of the system
> > > > is desired. It may be used in many critical states of the kernel,
> > > > such as when subsystems detect critical temperature conditions.
> > > > However, at certain points in the boot sequence, such as in the
> > > > middle of driver probes, userspace will be unable to power off the
> > > > system cleanly, leaving it in a critical state. In such cases,
> > > > /sbin/poweroff will return success (having forked off to attempt
> > > > powering off the system). However, the system as a whole will fail
> > > > to power off completely (since other modules will still be probed)
> > > > and remains functional with no userspace (since that will have shut
> > > > itself down).
> > > OK... This seems to me still a corner case that should be fixed in
> > > orderly_poweroff, not in thermal. But..
> > 
> > > 
> > > 
> > > > However, there is no clean way of detecting such a failure of
> > > > userspace to power off the system. In such scenarios, a backup
> > > > workqueue must be able to force a shutdown of the system when the
> > > > orderly shutdown has not succeeded after a configurable time period.
> > > 
> > > Given that a system running hot is a thermal issue, I guess we care
> > > more about this matter then..
> Yes!
> 
I just read this thread again,
https://patchwork.kernel.org/patch/8024581/
to recall the previous discussion.

https://patchwork.kernel.org/patch/8149891/
https://patchwork.kernel.org/patch/8149861/
should be the solution based on Ingo's suggestion, right?

And to me, this sounds like the right direction to go: thermal does not
need a backup shutdown solution, it just needs a kernel function call
that guarantees the system can be shut down or rebooted immediately.

Is there any reason that patch 1/2 was not accepted?

thanks,
rui
> > 
> > 
> > > 
> > > Reported-by: Nishanth Menon <nm@ti.com>
> > > Signed-off-by: Keerthy <j-keerthy@ti.com>
> > > ---
> > >  drivers/thermal/Kconfig        | 13 +++++++++++++
> > > >  drivers/thermal/thermal_core.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 55 insertions(+)
> > > 
> > > diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> > > index 0a16cf4..4cc55f9 100644
> > > --- a/drivers/thermal/Kconfig
> > > +++ b/drivers/thermal/Kconfig
> > > @@ -15,6 +15,19 @@ menuconfig THERMAL
> > >  
> > >  if THERMAL
> > >  
> > > +config THERMAL_EMERGENCY_POWEROFF_DELAY_MS
> > > +	int "Emergency poweroff delay in milli-seconds"
> > > +	depends on THERMAL
> > > +	default 0
> > > +	help
> > > > +	  The number of milliseconds to delay before emergency
> > > > +	  poweroff kicks in. The delay should be carefully profiled
> > > > +	  so as to give adequate time for orderly_poweroff. In case
> > > > +	  of failure of an orderly_poweroff the emergency poweroff
> > > > +	  kicks in after the delay has elapsed and shuts down the system.
> > > +
> > > +	  If set to 0 poweroff will happen immediately.
> > > +
> > >  config THERMAL_HWMON
> > >  	bool
> > >  	prompt "Expose thermal sensors as hwmon device"
> > > > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> > > index 11f0675..dc7fdd4 100644
> > > --- a/drivers/thermal/thermal_core.c
> > > +++ b/drivers/thermal/thermal_core.c
> > > @@ -322,6 +322,47 @@ static void handle_non_critical_trips(struct
> > > thermal_zone_device *tz,
> > >  		       def_governor->throttle(tz, trip);
> > >  }
> > >  
> > > +/**
> > > > + * emergency_poweroff_func - emergency poweroff work after a known delay
> > > > + * @work: work_struct associated with the emergency poweroff function
> > > + *
> > > + * This function is called in very critical situations to force
> > > + * a kernel poweroff after a configurable timeout value.
> > > + */
> > > +static void emergency_poweroff_func(struct work_struct *work)
> > > +{
> > > +	/**
> > > +	 * We have reached here after the emergency thermal
> > > shutdown
> > > +	 * Waiting period has expired. This means
> > > orderly_poweroff has
> > > +	 * not been able to shut off the system for some reason.
> > > +	 * Try to shut down the system immediately using
> > > pm_power_off
> > > +	 * if populated
> > > +	 */
> > The above is not a kernel doc entry...
> I will fix that.
> 
> > 
> > 
> > > 
> > > +	pr_warn("Attempting kernel_power_off\n");
> > > +	if (pm_power_off)
> > > +		pm_power_off();
> > > Why not call kernel_power_off() directly instead? That is what
> > > orderly poweroff calls in case it fails, which seems to be the
> > > missing part when userland returns success, and therefore we don't
> > > call kernel_power_off(). That path goes through machine_power_off(),
> > > which seems to be the default for pm_power_off() anyway.
> > 
> > kernel_power_off() handles the power off system call too.
> Yes. This is after orderly_poweroff fails, so I felt: why go through
> kernel_power_off rather than directly call pm_power_off, which directly
> pulls the power plug? This is a dire-straits situation, hence I
> preferred to call the last piece directly.
> 
> > 
> > 
> > > 
> > > +
> > > +	/**
> > not a kernel doc entry...
> Okay.
> 
> > 
> > 
> > > 
> > > +	 * Worst of the worst case trigger emergency restart
> > > +	 */
> > > > +	pr_warn("kernel_power_off has failed! Attempting emergency_restart\n");
> > > +	emergency_restart();
> > > +}
> > > +
> > > > +static DECLARE_DELAYED_WORK(emergency_poweroff_work, emergency_poweroff_func);
> > > +
> > > +/**
> > > + * emergency_poweroff - Trigger an emergency system poweroff
> > > + *
> > > > + * This may be called from any critical situation to trigger a system shutdown
> > > > + * after a known period of time. By default the delay is 0 millisecond
> > > + */
> > > +void thermal_emergency_poweroff(void)
> > > +{
> > > > +	schedule_delayed_work(&emergency_poweroff_work,
> > > > +			      msecs_to_jiffies(CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS));
> > > +}
> > > +
> > > >  static void handle_critical_trips(struct thermal_zone_device *tz,
> > > >  				  int trip, enum thermal_trip_type trip_type)
> > >  {
> > > @@ -343,6 +384,7 @@ static void handle_critical_trips(struct
> > > thermal_zone_device *tz,
> > >  			  "critical temperature reached(%d
> > > C),shutting down\n",
> > >  			  tz->temperature / 1000);
> > >  		orderly_poweroff(true);
> > > +		thermal_emergency_poweroff();
> > > Shouldn't we start counting the timeout before calling
> > > orderly_poweroff?
> Okay yes. That makes more sense. Queue the emergency function, start the
> countdown and immediately call orderly_poweroff. I will fix the above
> comments and send a v2. I still want to go with pm_power_off over
> kernel_power_off, as the timeout has already elapsed and the first
> thing we want is to shut off the SoC! Let me know.
> 
> 
> > 
> > 
> > > 
> > >  	}
> > >  }
> > >  
