From: Eduardo Valentin <edubezval@gmail.com>
To: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Zhang Rui <rui.zhang@intel.com>, Keerthy <j-keerthy@ti.com>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-omap@vger.kernel.org, nm@ti.com, t-kristo@ti.com
Subject: Re: [PATCH] thermal: core: Add a back up thermal shutdown mechanism
Date: Wed, 12 Apr 2017 09:34:24 -0700
Message-ID: <20170412163422.GA13484@localhost.localdomain>
In-Reply-To: <b565f2c9-fdd7-7525-da91-695f113e631b@ti.com>
Hey,
On Wed, Apr 12, 2017 at 11:31:18AM -0500, Grygorii Strashko wrote:
>
>
> On 04/12/2017 10:44 AM, Eduardo Valentin wrote:
> > Hello,
> >
> ...
>
> >
> > I agree. But there is nothing that says it is not reentrant. If you
> > saw something along these lines, can you please share?
> >
> >>>> will you generate a patch to do this?
> >>> Sure. I will generate a patch to take care of 1) To make sure that
> >>> orderly_poweroff is called only once right away. I have already
> >>> tested.
> >>>
> >>> for 2) Cancel all the scheduled work queues to monitor the
> >>> temperature.
> >>> I will take some more time to make it and test.
> >>>
> >>> Is that okay? Or you want me to send both together?
> >>>
> >> I think you can send patch for step 1 first.
> >
> > I am happy to see that Keerthy found the problem with his setup and a
> > possible solution. But I have a few concerns here.
> >
> > 1. If the regular shutdown process takes 10 seconds, that is far longer
> > than thermal should ever wait. orderly_poweroff() calls run_cmd() with
> > the wait flag set. That means, if the regular userland shutdown takes
> > 10s, we are waiting for it. Obviously this is not acceptable, especially
> > if you set up the critical trip to be 125C. Now, if you properly size the
> > critical trip to fire 10s (or however long the shutdown takes) before the
> > hotspot really reaches 125C, then fine. But based on what was described
> > in this thread, his system is waiting 10s on regular shutdown, and his
> > silicon is at an out-of-spec temperature for 10s, which is wrong.
> >
> > 2. The above scenario is not acceptable in the long run, especially from
> > a reliability perspective. If orderly_poweroff() has a possibility to
> > simply never return (or take too long), I would say the thermal
> > subsystem is using the wrong API.
> >
>
>
> Hh, I do not see that orderly_poweroff() will wait for anything now:
> void orderly_poweroff(bool force)
> {
> 	if (force) /* do not override the pending "true" */
> 		poweroff_force = true;
> 	schedule_work(&poweroff_work);
> 	^^^^^^^ async call. Even here there can be a pretty big delay if the system is under pressure
> }
>
>
> static int __orderly_poweroff(bool force)
> {
> 	int ret;
>
> 	ret = run_cmd(poweroff_cmd);
> 	      ^^^^ no wait for the process - only for exec. flags == UMH_WAIT_EXEC
Yeah, and that is what I really meant. Sorry for the confusion. The exec
is problematic in his scenario too, given he is running on a very
interesting NFS setup. Yes, UMH_WAIT_EXEC is set:
static int run_cmd(const char *cmd)
{
	char **argv;
	static char *envp[] = {
		"HOME=/",
		"PATH=/sbin:/bin:/usr/sbin:/usr/bin",
		NULL
	};
	int ret;
	argv = argv_split(GFP_KERNEL, cmd, NULL);
	if (argv) {
		ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
		argv_free(argv);
	} else {
		ret = -ENOMEM;
	}

	return ret;
}

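Since even the exec step can stall (for example with the rootfs on NFS), one way to bound the wait is a backup deadline that is armed when the orderly shutdown is requested. The following is only a userspace sketch of that bookkeeping, not kernel code: the struct, the enum, the function names and the millisecond timestamps are all hypothetical, and in the kernel the same idea would presumably be built on a delayed work item that ends in kernel_power_off().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes for the periodic check below. */
enum shutdown_action { SHUTDOWN_NONE, SHUTDOWN_WAIT, SHUTDOWN_FORCE };

/* Hypothetical state for one backup shutdown timer. */
struct backup_shutdown {
	bool armed;           /* true once an orderly shutdown was requested */
	uint64_t deadline_ms; /* absolute time after which we stop waiting */
};

/*
 * Arm the backup timer. Later calls are ignored so that a second
 * critical trip cannot push the deadline further out.
 */
static void backup_arm(struct backup_shutdown *b, uint64_t now_ms,
		       uint64_t timeout_ms)
{
	if (b->armed)
		return;
	b->armed = true;
	b->deadline_ms = now_ms + timeout_ms;
}

/*
 * Decide what to do at time now_ms: nothing if no shutdown is pending,
 * keep waiting before the deadline, force the power off once it passed.
 */
static enum shutdown_action backup_poll(const struct backup_shutdown *b,
					uint64_t now_ms)
{
	if (!b->armed)
		return SHUTDOWN_NONE;
	return now_ms >= b->deadline_ms ? SHUTDOWN_FORCE : SHUTDOWN_WAIT;
}
```

The timeout itself would have to be sized per platform, against how long the silicon can tolerate the critical temperature; the sketch deliberately does not pick a value.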
>
> 	if (ret && force) {
> 		pr_warn("Failed to start orderly shutdown: forcing the issue\n");
>
> 		/*
> 		 * I guess this should try to kick off some daemon to sync and
> 		 * poweroff asap. Or not even bother syncing if we're doing an
> 		 * emergency shutdown?
> 		 */
> 		emergency_sync();
> 		kernel_power_off();
> 		^^^ force power off, but only if run_cmd() failed - for example /sbin/poweroff doesn't exist
> 	}
>
> 	return ret;
> }
>
> static bool poweroff_force;
>
> static void poweroff_work_func(struct work_struct *work)
> {
> 	__orderly_poweroff(poweroff_force);
> }
>
> As a result, thermal has no control over the power off any more after calling orderly_poweroff() and can get the result
> of the userspace poweroff binary execution.
>
> >
> > If you are going to implement the above two patches, keep in mind:
> > i. At least within the thermal subsystem, you need to take care of all
> > zones that could trigger a shutdown.
> > ii. serializing the calls to orderly_poweroff() seems to be more
> > concerning than cancelling all monitoring.
> >
> >
>
> --
> regards,
> -grygorii
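
To make points i. and ii. above a bit more concrete, here is a minimal userspace sketch of serializing the shutdown request so that only the first zone to hit its critical trip actually triggers it. It is an illustration under assumptions, not the proposed patch: fake_orderly_poweroff() is a hypothetical stand-in for the kernel's orderly_poweroff(), thermal_request_shutdown_once() is an invented name, and C11 atomic_flag is used where kernel code would more likely use a mutex or an atomic bit operation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for orderly_poweroff(); it only counts calls. */
static int poweroff_requests;

static void fake_orderly_poweroff(void)
{
	poweroff_requests++;
}

/* One flag shared by every thermal zone that can request a shutdown. */
static atomic_flag shutdown_started = ATOMIC_FLAG_INIT;

/*
 * Request the shutdown at most once. atomic_flag_test_and_set() returns
 * the previous value, so only the first caller sees "false" and goes on
 * to issue the request; every later caller returns without doing so.
 */
static bool thermal_request_shutdown_once(void)
{
	if (atomic_flag_test_and_set(&shutdown_started))
		return false;
	fake_orderly_poweroff();
	return true;
}
```

Because the flag is global rather than per zone, it covers all zones that could trigger a shutdown without having to cancel their monitoring work first.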