* [PATCH v4 1/3] dt-bindings: thermal-zones: Document critical-action
@ 2023-08-29 12:09 Fabio Estevam
  2023-08-29 12:09 ` [PATCH v4 2/3] reboot: Introduce hw_protection_reboot() Fabio Estevam
  2023-08-29 12:09 ` [PATCH v4 3/3] thermal: thermal_core: Allow rebooting after critical temp Fabio Estevam
  0 siblings, 2 replies; 4+ messages in thread
From: Fabio Estevam @ 2023-08-29 12:09 UTC
To: daniel.lezcano
Cc: rafael, amitk, rui.zhang, linux-pm, krzysztof.kozlowski+dt,
    robh+dt, conor+dt, devicetree, Fabio Estevam, Krzysztof Kozlowski

From: Fabio Estevam <festevam@denx.de>

Document the critical-action property to describe the thermal action
the OS should perform after the critical temperature is reached.

The possible values are "shutdown" and "reboot".

The motivation for introducing the critical-action property is that
different systems may need different thermal actions when the critical
temperature is reached.

For example, a desktop PC may want the OS to trigger a shutdown
when the critical temperature is reached.

However, in some embedded cases, such behavior does not suit well,
as the board may be unattended in the field and rebooting may be a
better approach.

The bootloader may also benefit from this new property as it can check
the SoC temperature and in case the temperature is above the critical
point, it can trigger a shutdown or reboot accordingly.

Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---
Changes since v3:
- Explain why this property is needed. (Krzysztof)
- Added Krzysztof's Reviewed-by tag.

 .../devicetree/bindings/thermal/thermal-zones.yaml       | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
index 4f3acdc4dec0..c2e4d28f885b 100644
--- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
+++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
@@ -75,6 +75,15 @@ patternProperties:
           framework and assumes that the thermal sensors in this zone
           support interrupts.
 
+      critical-action:
+        $ref: /schemas/types.yaml#/definitions/string
+        description:
+          The action the OS should perform after the critical temperature is reached.
+
+        enum:
+          - shutdown
+          - reboot
+
       thermal-sensors:
         $ref: /schemas/types.yaml#/definitions/phandle-array
         maxItems: 1
-- 
2.34.1
* [PATCH v4 2/3] reboot: Introduce hw_protection_reboot()
From: Fabio Estevam @ 2023-08-29 12:09 UTC
To: daniel.lezcano
Cc: rafael, amitk, rui.zhang, linux-pm, krzysztof.kozlowski+dt,
    robh+dt, conor+dt, devicetree, Fabio Estevam

From: Fabio Estevam <festevam@denx.de>

Introduce hw_protection_reboot() to trigger an emergency reboot.

It is a counterpart of hw_protection_shutdown() with the difference
that it will force a reboot instead of shutdown.

The motivation for doing this is to allow the thermal subsystem
to trigger a reboot when the temperature reaches the critical
temperature.

Signed-off-by: Fabio Estevam <festevam@denx.de>
---
Changes since v3:
- None

 include/linux/reboot.h |  1 +
 kernel/reboot.c        | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 2b6bb593be5b..4a319bc24f6a 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -174,6 +174,7 @@ void ctrl_alt_del(void);
 
 extern void orderly_poweroff(bool force);
 extern void orderly_reboot(void);
+void hw_protection_reboot(const char *reason, int ms_until_forced);
 void hw_protection_shutdown(const char *reason, int ms_until_forced);
 
 /*
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 3bba88c7ffc6..05333ae8bc6b 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -952,6 +952,40 @@ static void hw_failure_emergency_poweroff(int poweroff_delay_ms)
 			      msecs_to_jiffies(poweroff_delay_ms));
 }
 
+/**
+ * hw_protection_reboot - Trigger an emergency system reboot
+ *
+ * @reason:            Reason of emergency reboot to be printed.
+ * @ms_until_forced:   Time to wait for orderly reboot before triggering a
+ *                     forced reboot. Negative value disables the forced
+ *                     reboot.
+ *
+ * Initiate an emergency system reboot in order to protect hardware from
+ * further damage. Usage examples include a thermal protection.
+ *
+ * NOTE: The request is ignored if protection reboot is already pending even
+ * if the previous request has given a large timeout for forced reboot.
+ * Can be called from any context.
+ */
+void hw_protection_reboot(const char *reason, int ms_until_forced)
+{
+	static atomic_t allow_proceed = ATOMIC_INIT(1);
+
+	pr_emerg("HARDWARE PROTECTION reboot (%s)\n", reason);
+
+	/* Reboot should be initiated only once. */
+	if (!atomic_dec_and_test(&allow_proceed))
+		return;
+
+	/*
+	 * Queue a backup emergency reboot in the event of
+	 * orderly_reboot failure
+	 */
+	hw_failure_emergency_poweroff(ms_until_forced);
+	orderly_reboot();
+}
+EXPORT_SYMBOL_GPL(hw_protection_reboot);
+
 /**
  * hw_protection_shutdown - Trigger an emergency system poweroff
  *
-- 
2.34.1
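The "initiated only once" guard in hw_protection_reboot() can be sketched in userspace as follows. This is a minimal illustration, not kernel code: dec_and_test(), protection_reboot_once() and reboots_started are hypothetical names, and C11 <stdatomic.h> is assumed as a stand-in for the kernel's atomic_t and atomic_dec_and_test().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's atomic_dec_and_test():
 * decrement the counter and report whether the new value is zero.
 */
static bool dec_and_test(atomic_int *v)
{
	/* atomic_fetch_sub() returns the value held before the subtraction. */
	return atomic_fetch_sub(v, 1) == 1;
}

/* Counts how many times the (stubbed) reboot path actually ran. */
static int reboots_started;

/*
 * Mirrors the guard used in the patch: the message is always printed,
 * but only the first caller gets past the check. Returns true if this
 * call was the one that proceeded.
 */
static bool protection_reboot_once(const char *reason)
{
	static atomic_int allow_proceed = 1;

	assert(reason != NULL);
	printf("HARDWARE PROTECTION reboot (%s)\n", reason);

	/* The counter starts at 1, so only the first decrement reaches 0. */
	if (!dec_and_test(&allow_proceed))
		return false;

	reboots_started++;	/* the patch queues the backup and calls orderly_reboot() here */
	return true;
}
```

Calling protection_reboot_once() repeatedly prints the message every time, but only the first call proceeds, which matches the NOTE in the kernel-doc comment about later requests being ignored.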
* [PATCH v4 3/3] thermal: thermal_core: Allow rebooting after critical temp
From: Fabio Estevam @ 2023-08-29 12:09 UTC
To: daniel.lezcano
Cc: rafael, amitk, rui.zhang, linux-pm, krzysztof.kozlowski+dt,
    robh+dt, conor+dt, devicetree, Fabio Estevam

From: Fabio Estevam <festevam@denx.de>

Currently, the default mechanism is to trigger a shutdown after the
critical temperature is reached.

In some embedded cases, such behavior does not suit well, as the board may
be unattended in the field and rebooting may be a better approach.

The bootloader may also check the temperature and only allow the boot to
proceed when the temperature is below a certain threshold.

Introduce support for allowing a reboot to be triggered after the
critical temperature is reached.

If the "critical-action" devicetree property is not found, fall back to
the shutdown action to preserve the existing default behavior.

Tested on an i.MX8MM board with the following devicetree changes:

	thermal-zones {
		cpu-thermal {
			critical-action = "reboot";
		};
	};

Signed-off-by: Fabio Estevam <festevam@denx.de>
---
Changes since v3:
- None.

 drivers/thermal/thermal_core.c |  8 +++++++-
 drivers/thermal/thermal_of.c   | 27 +++++++++++++++++++++++++++
 include/linux/thermal.h        |  6 ++++++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index a59700593d32..f69e1667acb1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -320,11 +320,17 @@ void thermal_zone_device_critical(struct thermal_zone_device *tz)
 	 * Its a must for forced_emergency_poweroff_work to be scheduled.
 	 */
 	int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
+	void (*hw_protection_action)(const char *reason, int ms_until_forced);
 
 	dev_emerg(&tz->device, "%s: critical temperature reached, "
 		  "shutting down\n", tz->type);
 
-	hw_protection_shutdown("Temperature too high", poweroff_delay_ms);
+	hw_protection_action = hw_protection_shutdown;
+
+	if (tz->action == THERMAL_CRITICAL_ACTION_REBOOT)
+		hw_protection_action = hw_protection_reboot;
+
+	hw_protection_action("Temperature too high", poweroff_delay_ms);
 }
 EXPORT_SYMBOL(thermal_zone_device_critical);
 
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 4ca905723429..8bc28cba7406 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -218,6 +218,31 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
 	return tz;
 }
 
+static const char * const critical_actions[] = {
+	[THERMAL_CRITICAL_ACTION_SHUTDOWN]	= "shutdown",
+	[THERMAL_CRITICAL_ACTION_REBOOT]	= "reboot",
+};
+
+static void thermal_of_get_critical_action(struct device_node *np,
+					   enum thermal_action *action)
+{
+	const char *action_string;
+	int i, ret;
+
+	ret = of_property_read_string(np, "critical-action", &action_string);
+	if (ret < 0)
+		goto out_default_action;
+
+	for (i = 0; i < ARRAY_SIZE(critical_actions); i++)
+		if (!strcasecmp(action_string, critical_actions[i])) {
+			*action = i;
+			return;
+		}
+
+out_default_action:
+	*action = THERMAL_CRITICAL_ACTION_SHUTDOWN;
+}
+
 static int thermal_of_monitor_init(struct device_node *np, int *delay, int *pdelay)
 {
 	int ret;
@@ -516,6 +541,8 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
 		goto out_kfree_trips;
 	}
 
+	thermal_of_get_critical_action(np, &tz->action);
+
 	ret = thermal_zone_device_enable(tz);
 	if (ret) {
 		pr_err("Failed to enabled thermal zone '%s', id=%d: %d\n",
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index b449a46766f5..08854f640db9 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -34,6 +34,11 @@ struct thermal_cooling_device;
 struct thermal_instance;
 struct thermal_attr;
 
+enum thermal_action {
+	THERMAL_CRITICAL_ACTION_SHUTDOWN, /* shutdown when crit temperature is reached */
+	THERMAL_CRITICAL_ACTION_REBOOT, /* reboot when crit temperature is reached */
+};
+
 enum thermal_trend {
 	THERMAL_TREND_STABLE,        /* temperature is stable */
 	THERMAL_TREND_RAISING,       /* temperature is raising */
@@ -187,6 +192,7 @@ struct thermal_zone_device {
 	struct list_head node;
 	struct delayed_work poll_queue;
 	enum thermal_notify_event notify_event;
+	enum thermal_action action;
 };
 
 /**
-- 
2.34.1
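The lookup done by thermal_of_get_critical_action() can be tried outside the kernel with a sketch like the one below. It is an illustration under assumptions, not the patch itself: parse_critical_action() is a hypothetical helper that takes the property string directly (NULL standing in for "property not found") instead of calling of_property_read_string(), and ARRAY_SIZE is redefined locally.

```c
#include <stddef.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

enum thermal_action {
	THERMAL_CRITICAL_ACTION_SHUTDOWN = 0,
	THERMAL_CRITICAL_ACTION_REBOOT,
};

/* Same table as in the patch: the enum value indexes the property string. */
static const char * const critical_actions[] = {
	[THERMAL_CRITICAL_ACTION_SHUTDOWN] = "shutdown",
	[THERMAL_CRITICAL_ACTION_REBOOT]   = "reboot",
};

/* Local replacement for the kernel's ARRAY_SIZE() macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical helper: 'value' is the "critical-action" string, or NULL
 * when the property is absent. Missing and unrecognised values both fall
 * back to shutdown, which is the behavior the commit message describes.
 */
static enum thermal_action parse_critical_action(const char *value)
{
	size_t i;

	if (value == NULL)
		return THERMAL_CRITICAL_ACTION_SHUTDOWN;

	for (i = 0; i < ARRAY_SIZE(critical_actions); i++)
		if (!strcasecmp(value, critical_actions[i]))
			return (enum thermal_action)i;

	return THERMAL_CRITICAL_ACTION_SHUTDOWN;
}
```

One thing the sketch makes visible: because the comparison is case-insensitive, a string such as "REBOOT" is accepted here even though the binding's enum lists only the lowercase spellings.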
* Re: [PATCH v4 3/3] thermal: thermal_core: Allow rebooting after critical temp
From: Rafael J. Wysocki @ 2023-08-29 18:31 UTC
To: Fabio Estevam
Cc: daniel.lezcano, rafael, amitk, rui.zhang, linux-pm,
    krzysztof.kozlowski+dt, robh+dt, conor+dt, devicetree, Fabio Estevam

On Tue, Aug 29, 2023 at 2:09 PM Fabio Estevam <festevam@gmail.com> wrote:
>
> From: Fabio Estevam <festevam@denx.de>
>
> Currently, the default mechanism is to trigger a shutdown after the
> critical temperature is reached.
>
> In some embedded cases, such behavior does not suit well, as the board may
> be unattended in the field and rebooting may be a better approach.
>
> The bootloader may also check the temperature and only allow the boot to
> proceed when the temperature is below a certain threshold.
>
> Introduce support for allowing a reboot to be triggered after the
> critical temperature is reached.
>
> If the "critical-action" devicetree property is not found, fall back to
> the shutdown action to preserve the existing default behavior.
>
> Tested on an i.MX8MM board with the following devicetree changes:
>
>         thermal-zones {
>                 cpu-thermal {
>                         critical-action = "reboot";
>                 };
>         };
>
> Signed-off-by: Fabio Estevam <festevam@denx.de>
> ---
> Changes since v3:
> - None.
>
>  drivers/thermal/thermal_core.c |  8 +++++++-
>  drivers/thermal/thermal_of.c   | 27 +++++++++++++++++++++++++++
>  include/linux/thermal.h        |  6 ++++++
>  3 files changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index a59700593d32..f69e1667acb1 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -320,11 +320,17 @@ void thermal_zone_device_critical(struct thermal_zone_device *tz)
>          * Its a must for forced_emergency_poweroff_work to be scheduled.
>          */
>         int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
> +       void (*hw_protection_action)(const char *reason, int ms_until_forced);
>
>         dev_emerg(&tz->device, "%s: critical temperature reached, "
>                   "shutting down\n", tz->type);
>
> -       hw_protection_shutdown("Temperature too high", poweroff_delay_ms);
> +       hw_protection_action = hw_protection_shutdown;
> +
> +       if (tz->action == THERMAL_CRITICAL_ACTION_REBOOT)
> +               hw_protection_action = hw_protection_reboot;
> +
> +       hw_protection_action("Temperature too high", poweroff_delay_ms);

Why not define

static const char *msg = "Temperature too high";

and then

if (tz->action == THERMAL_CRITICAL_ACTION_REBOOT)
        hw_protection_reboot(msg, poweroff_delay_ms);
else
        hw_protection_shutdown(msg, poweroff_delay_ms);

>  }
>  EXPORT_SYMBOL(thermal_zone_device_critical);
>
> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
> index 4ca905723429..8bc28cba7406 100644
> --- a/drivers/thermal/thermal_of.c
> +++ b/drivers/thermal/thermal_of.c
> @@ -218,6 +218,31 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
>         return tz;
>  }
>
> +static const char * const critical_actions[] = {
> +       [THERMAL_CRITICAL_ACTION_SHUTDOWN]      = "shutdown",
> +       [THERMAL_CRITICAL_ACTION_REBOOT]        = "reboot",
> +};
> +
> +static void thermal_of_get_critical_action(struct device_node *np,
> +                                          enum thermal_action *action)
> +{
> +       const char *action_string;
> +       int i, ret;
> +
> +       ret = of_property_read_string(np, "critical-action", &action_string);
> +       if (ret < 0)
> +               goto out_default_action;
> +
> +       for (i = 0; i < ARRAY_SIZE(critical_actions); i++)
> +               if (!strcasecmp(action_string, critical_actions[i])) {
> +                       *action = i;
> +                       return;
> +               }
> +
> +out_default_action:
> +       *action = THERMAL_CRITICAL_ACTION_SHUTDOWN;
> +}
> +
>  static int thermal_of_monitor_init(struct device_node *np, int *delay, int *pdelay)
>  {
>         int ret;
> @@ -516,6 +541,8 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
>                 goto out_kfree_trips;
>         }
>
> +       thermal_of_get_critical_action(np, &tz->action);
> +
>         ret = thermal_zone_device_enable(tz);
>         if (ret) {
>                 pr_err("Failed to enabled thermal zone '%s', id=%d: %d\n",
> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> index b449a46766f5..08854f640db9 100644
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -34,6 +34,11 @@ struct thermal_cooling_device;
>  struct thermal_instance;
>  struct thermal_attr;
>
> +enum thermal_action {
> +       THERMAL_CRITICAL_ACTION_SHUTDOWN, /* shutdown when crit temperature is reached */

THERMAL_CRITICAL_ACTION_SHUTDOWN = 0,

so it is clear what will happen on non-DT platforms.

> +       THERMAL_CRITICAL_ACTION_REBOOT, /* reboot when crit temperature is reached */
> +};
> +
>  enum thermal_trend {
>         THERMAL_TREND_STABLE,        /* temperature is stable */
>         THERMAL_TREND_RAISING,       /* temperature is raising */
> @@ -187,6 +192,7 @@ struct thermal_zone_device {
>         struct list_head node;
>         struct delayed_work poll_queue;
>         enum thermal_notify_event notify_event;
> +       enum thermal_action action;
>  };
>
>  /**
> --
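Both review comments can be exercised together in a small userspace sketch. Everything below is a stand-in under assumptions: struct zone, zone_critical() and last_action are hypothetical names, the two hw_protection_*() functions are stubs that only record which one ran, and the delay is a literal instead of CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS.

```c
#include <stdio.h>

enum thermal_action {
	THERMAL_CRITICAL_ACTION_SHUTDOWN = 0,	/* explicit 0, as suggested in the review */
	THERMAL_CRITICAL_ACTION_REBOOT,
};

/* Records which stub ran last; -1 means none yet. */
static int last_action = -1;

/* Stub: the real function lives in kernel/reboot.c. */
static void hw_protection_shutdown(const char *reason, int ms_until_forced)
{
	printf("shutdown: %s (forced after %d ms)\n", reason, ms_until_forced);
	last_action = THERMAL_CRITICAL_ACTION_SHUTDOWN;
}

/* Stub for the function introduced by patch 2/3. */
static void hw_protection_reboot(const char *reason, int ms_until_forced)
{
	printf("reboot: %s (forced after %d ms)\n", reason, ms_until_forced);
	last_action = THERMAL_CRITICAL_ACTION_REBOOT;
}

/* Hypothetical, stripped-down stand-in for struct thermal_zone_device. */
struct zone {
	enum thermal_action action;
};

/* The if/else form proposed in the review, without the function pointer. */
static void zone_critical(const struct zone *tz)
{
	static const char *msg = "Temperature too high";
	int poweroff_delay_ms = 0;	/* stand-in for the Kconfig value */

	if (tz->action == THERMAL_CRITICAL_ACTION_REBOOT)
		hw_protection_reboot(msg, poweroff_delay_ms);
	else
		hw_protection_shutdown(msg, poweroff_delay_ms);
}
```

A zone whose action field was never written, for example one that is zero-initialized and registered without going through thermal_of_zone_register(), takes the shutdown branch. That seems to be the point of spelling out "= 0": the default on non-DT platforms is then visible in the enum definition itself.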