* [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
@ 2024-05-27 12:41 Geert Uytterhoeven
2024-05-27 12:41 ` [PATCH/RFC 1/3] earlycon: Export clock and PM Domain info from FDT Geert Uytterhoeven
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Geert Uytterhoeven @ 2024-05-27 12:41 UTC (permalink / raw)
To: Ulf Hansson, Greg Kroah-Hartman, Jiri Slaby, Rafael J . Wysocki,
Rob Herring, Saravana Kannan
Cc: Claudiu Beznea, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel, Geert Uytterhoeven
Hi all,
Since commit a47cf07f60dcb02d ("serial: core: Call
device_set_awake_path() for console port"), the serial driver properly
handles the case where the serial console is part of the awake path, and
it looked like we could start removing special serial console handling
from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
Unfortunately the devil is in the details, as usual...
Earlycon relies on the serial port to be initialized by the firmware
and/or bootloader. Linux is not aware of any hardware dependencies that
must be met to keep the port working, and thus cannot guarantee they
stay met, until the full serial driver takes over.
E.g. all unused clocks and unused PM Domains are disabled in a late
initcall. As this happens after the full serial driver has taken over,
the serial port's clock and/or PM Domain are no longer deemed unused,
and this is typically not a problem.
However, if the serial port's clock or PM Domain is shared with another
device, and that other device is runtime-suspended before the full
serial driver has probed, the serial port's clock and/or PM Domain will
be disabled inadvertently. Any subsequent serial console output will
cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
ports share their PM Domain with several other I/O devices. After the
use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
before the full serial driver takes over, the PM Domain containing the
early serial port is powered down, causing a lock-up when booted with
"earlycon".
This RFC patch series aims to provide a mechanism for handling this, and
to fix it for the PM Domain case:
1. The first patch provides a mechanism to let the clock and/or PM
Domain subsystem or drivers handle this, by exporting the clock and
PM Domain dependencies for the serial port, as available in the
system's device tree,
2. The second patch introduces a new flag to handle a PM domain that
must be kept powered-on during early boot, and by setting this flag
if the PM Domain contains the serial console (originally I handled
this inside rmobile-sysc, but it turned out to be easy to
generalize this to other platforms in the core PM Domain code).
3. The third patch removes the no longer needed special console
handling from the R-Mobile SYSC PM Domain driver.
I did not fix the similar clock issue, as it is more complex (there can
be multiple clocks, and each clock provider can have its own value of
#clock-cells), and I do not need it for Renesas ARM platforms.
This has been tested on the APE6-EVM, Armadillo-800-EVA, and KZM-A9-GT
development boards, with and without earlycon, including s2ram with and
without no_console_suspend.
Notes:
- This should not be needed on RZ/G3S, where each serial port device
has its own PM Domain,
- drivers/clk/imx/clk.c and drivers/pmdomain/imx/scu-pd.c have special
handling for the of_stdout device, but is probably not affected, as
each serial port seems to share its PM Domain only with the serial
port's clock controller.
Thanks for your comments!
Geert Uytterhoeven (3):
earlycon: Export clock and PM Domain info from FDT
pmdomain: core: Avoid earlycon power-down
pmdomain: renesas: rmobile-sysc: Remove serial console handling
drivers/pmdomain/core.c | 24 ++++++++++++++++--
drivers/pmdomain/renesas/rmobile-sysc.c | 33 +------------------------
drivers/tty/serial/earlycon.c | 14 ++++++++++-
include/linux/pm_domain.h | 4 +++
include/linux/serial_core.h | 10 ++++++++
5 files changed, 50 insertions(+), 35 deletions(-)
--
2.34.1
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH/RFC 1/3] earlycon: Export clock and PM Domain info from FDT
2024-05-27 12:41 [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
@ 2024-05-27 12:41 ` Geert Uytterhoeven
2024-05-29 9:01 ` Geert Uytterhoeven
2024-05-27 12:41 ` [PATCH/RFC 2/3] pmdomain: core: Avoid earlycon power-down Geert Uytterhoeven
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Geert Uytterhoeven @ 2024-05-27 12:41 UTC (permalink / raw)
To: Ulf Hansson, Greg Kroah-Hartman, Jiri Slaby, Rafael J . Wysocki,
Rob Herring, Saravana Kannan
Cc: Claudiu Beznea, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel, Geert Uytterhoeven
Earlycon relies on the serial port to be initialized by the firmware
and/or bootloader. Linux is not aware of any hardware dependencies that
must be met to keep the port working, and thus cannot guarantee they
stay met, until the full serial driver takes over.
E.g. all unused clocks and unused PM Domains are disabled in a late
initcall. As this happens after the full serial driver has taken over,
the serial port's clock and/or PM Domain are no longer deemed unused,
and this is typically not a problem.
However, if the serial port's clock or PM Domain is shared with another
device, and that other device is runtime-suspended before the full
serial driver has probed, the serial port's clock and/or PM Domain will
be disabled inadvertently. Any subsequent serial console output will
cause a crash or system lock-up.
Provide a mechanism to let the clock and/or PM Domain subsystem or
drivers handle this, by exporting the clock and PM Domain dependencies
for the serial port, as available in the system's device tree.
Note that as this is done during early boot-up, the device_node
structure pointing to the earlycon console is not yet created, so this
has to resort to raw property data.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
drivers/tty/serial/earlycon.c | 14 +++++++++++++-
include/linux/serial_core.h | 10 ++++++++++
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/earlycon.c b/drivers/tty/serial/earlycon.c
index a5fbb6ed38aed681..abe4831d9685e2b8 100644
--- a/drivers/tty/serial/earlycon.c
+++ b/drivers/tty/serial/earlycon.c
@@ -250,11 +250,14 @@ early_param("earlycon", param_setup_earlycon);
#ifdef CONFIG_OF_EARLY_FLATTREE
+const __be32 *earlycon_clocks, *earlycon_power_domains;
+int earlycon_clocks_ncells, earlycon_power_domains_ncells;
+
int __init of_setup_earlycon(const struct earlycon_id *match,
unsigned long node,
const char *options)
{
- int err;
+ int err, size;
struct uart_port *port = &early_console_dev.port;
const __be32 *val;
bool big_endian;
@@ -309,6 +312,15 @@ int __init of_setup_earlycon(const struct earlycon_id *match,
if (val)
port->uartclk = be32_to_cpu(*val);
+ earlycon_clocks = of_get_flat_dt_prop(node, "clocks", &size);
+ if (earlycon_clocks)
+ earlycon_clocks_ncells = size / sizeof(u32);
+
+ earlycon_power_domains = of_get_flat_dt_prop(node, "power-domains",
+ &size);
+ if (earlycon_power_domains)
+ earlycon_power_domains_ncells = size / sizeof(u32);
+
if (options) {
early_console_dev.baud = simple_strtoul(options, NULL, 0);
strscpy(early_console_dev.options, options,
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 8cb65f50e830c8d4..70689a3363951dac 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -954,6 +954,16 @@ static const bool earlycon_acpi_spcr_enable EARLYCON_USED_OR_UNUSED;
static inline int setup_earlycon(char *buf) { return 0; }
#endif
+#ifdef CONFIG_OF_EARLY_FLATTREE
+extern const __be32 *earlycon_clocks, *earlycon_power_domains;
+extern int earlycon_clocks_ncells, earlycon_power_domains_ncells;
+#else
+#define earlycon_clocks NULL
+#define earlycon_clocks_ncells 0
+#define earlycon_power_domains NULL
+#define earlycon_power_domains_ncells 0
+#endif
+
/* Variant of uart_console_registered() when the console_list_lock is held. */
static inline bool uart_console_registered_locked(struct uart_port *port)
{
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH/RFC 2/3] pmdomain: core: Avoid earlycon power-down
2024-05-27 12:41 [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
2024-05-27 12:41 ` [PATCH/RFC 1/3] earlycon: Export clock and PM Domain info from FDT Geert Uytterhoeven
@ 2024-05-27 12:41 ` Geert Uytterhoeven
2024-05-27 12:41 ` [PATCHPATCH 3/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Geert Uytterhoeven @ 2024-05-27 12:41 UTC (permalink / raw)
To: Ulf Hansson, Greg Kroah-Hartman, Jiri Slaby, Rafael J . Wysocki,
Rob Herring, Saravana Kannan
Cc: Claudiu Beznea, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel, Geert Uytterhoeven
If the earlycon serial port's PM Domain is shared with another device,
and that other device is runtime-suspended before the full serial driver
has overtaken earlycon, the serial port's PM Domain will be disabled
inadvertently. Any subsequent serial console output will cause a crash
or system lock-up.
Avoid this by introducing a new flag to handle a PM domain that must be
kept powered-on during early boot, and by setting this flag if the PM
Domain contains the serial console.
Note that the PM Domain can still be powered off later, when the serial
port's power state agrees, e.g. during s2ram without no_console_suspend.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
drivers/pmdomain/core.c | 24 ++++++++++++++++++++++--
include/linux/pm_domain.h | 4 ++++
2 files changed, 26 insertions(+), 2 deletions(-)
diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
index 342779464c0d7e84..97b9b50257eb354c 100644
--- a/drivers/pmdomain/core.c
+++ b/drivers/pmdomain/core.c
@@ -22,6 +22,7 @@
#include <linux/export.h>
#include <linux/cpu.h>
#include <linux/debugfs.h>
+#include <linux/serial_core.h>
#define GENPD_RETRY_MAX_MS 250 /* Approximate */
@@ -129,6 +130,7 @@ static const struct genpd_lock_ops genpd_spin_ops = {
#define genpd_is_cpu_domain(genpd) (genpd->flags & GENPD_FLAG_CPU_DOMAIN)
#define genpd_is_rpm_always_on(genpd) (genpd->flags & GENPD_FLAG_RPM_ALWAYS_ON)
#define genpd_is_opp_table_fw(genpd) (genpd->flags & GENPD_FLAG_OPP_TABLE_FW)
+#define genpd_is_early_on(genpd) (genpd->flags & GENPD_FLAG_EARLY_ON)
static inline bool irq_safe_dev_in_sleep_domain(struct device *dev,
const struct generic_pm_domain *genpd)
@@ -725,8 +727,9 @@ static int genpd_power_off(struct generic_pm_domain *genpd, bool one_dev_on,
* (2) When the domain has a subdomain being powered on.
*/
if (genpd_is_always_on(genpd) ||
- genpd_is_rpm_always_on(genpd) ||
- atomic_read(&genpd->sd_count) > 0)
+ (genpd_is_early_on(genpd) && system_state < SYSTEM_RUNNING) ||
+ genpd_is_rpm_always_on(genpd) ||
+ atomic_read(&genpd->sd_count) > 0)
return -EBUSY;
/*
@@ -2367,6 +2370,10 @@ int of_genpd_add_provider_simple(struct device_node *np,
genpd->dev.of_node = np;
+ if (earlycon_power_domains &&
+ np->phandle == be32_to_cpup(earlycon_power_domains))
+ genpd->flags |= GENPD_FLAG_EARLY_ON;
+
/* Parse genpd OPP table */
if (!genpd_is_opp_table_fw(genpd) && genpd->set_performance_state) {
ret = dev_pm_opp_of_add_table(&genpd->dev);
@@ -2447,6 +2454,19 @@ int of_genpd_add_provider_onecell(struct device_node *np,
genpd->has_provider = true;
}
+ if (earlycon_power_domains && earlycon_power_domains_ncells == 2 &&
+ np->phandle == be32_to_cpup(earlycon_power_domains)) {
+ struct of_phandle_args genpdspec = {
+ .np = np,
+ .args_count = 1,
+ .args[0] = be32_to_cpup(earlycon_power_domains + 1),
+ };
+
+ genpd = data->xlate(&genpdspec, data);
+ if (!IS_ERR(genpd))
+ genpd->flags |= GENPD_FLAG_EARLY_ON;
+ }
+
ret = genpd_add_provider(np, data->xlate, data);
if (ret < 0)
goto error;
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 772d3280d35fafa2..012d58ffc7059e0d 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -92,6 +92,9 @@ struct dev_pm_domain_list {
* GENPD_FLAG_OPP_TABLE_FW: The genpd provider supports performance states,
* but its corresponding OPP tables are not
* described in DT, but are given directly by FW.
+ *
+ * GENPD_FLAG_EARLY_ON: Instructs genpd to keep the PM domain powered
+ * on during early boot.
*/
#define GENPD_FLAG_PM_CLK (1U << 0)
#define GENPD_FLAG_IRQ_SAFE (1U << 1)
@@ -101,6 +104,7 @@ struct dev_pm_domain_list {
#define GENPD_FLAG_RPM_ALWAYS_ON (1U << 5)
#define GENPD_FLAG_MIN_RESIDENCY (1U << 6)
#define GENPD_FLAG_OPP_TABLE_FW (1U << 7)
+#define GENPD_FLAG_EARLY_ON (1U << 8)
enum gpd_status {
GENPD_STATE_ON = 0, /* PM domain is on */
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCHPATCH 3/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-05-27 12:41 [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
2024-05-27 12:41 ` [PATCH/RFC 1/3] earlycon: Export clock and PM Domain info from FDT Geert Uytterhoeven
2024-05-27 12:41 ` [PATCH/RFC 2/3] pmdomain: core: Avoid earlycon power-down Geert Uytterhoeven
@ 2024-05-27 12:41 ` Geert Uytterhoeven
2024-06-05 9:34 ` [PATCH/RFC 0/3] " Ulf Hansson
2024-06-11 4:52 ` claudiu beznea
4 siblings, 0 replies; 14+ messages in thread
From: Geert Uytterhoeven @ 2024-05-27 12:41 UTC (permalink / raw)
To: Ulf Hansson, Greg Kroah-Hartman, Jiri Slaby, Rafael J . Wysocki,
Rob Herring, Saravana Kannan
Cc: Claudiu Beznea, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel, Geert Uytterhoeven
Since commit a47cf07f60dcb02d ("serial: core: Call
device_set_awake_path() for console port"), the serial driver properly
handles the case where the serial console is part of the awake path, so
the special serial console handling can be removed from the R-Mobile
SYSC PM Domain driver.
The PM Domain containing the serial port can now be powered down when
none of its devices are in use. Before, it was only powered down during
s2ram without no_console_suspend.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
drivers/pmdomain/renesas/rmobile-sysc.c | 33 +------------------------
1 file changed, 1 insertion(+), 32 deletions(-)
diff --git a/drivers/pmdomain/renesas/rmobile-sysc.c b/drivers/pmdomain/renesas/rmobile-sysc.c
index 0b77f37787d58f48..cc1f6f8b7a746850 100644
--- a/drivers/pmdomain/renesas/rmobile-sysc.c
+++ b/drivers/pmdomain/renesas/rmobile-sysc.c
@@ -10,7 +10,6 @@
* Copyright (C) 2011 Magnus Damm
*/
#include <linux/clk/renesas.h>
-#include <linux/console.h>
#include <linux/delay.h>
#include <linux/io.h>
#include <linux/iopoll.h>
@@ -31,8 +30,6 @@
struct rmobile_pm_domain {
struct generic_pm_domain genpd;
- struct dev_power_governor *gov;
- int (*suspend)(void);
void __iomem *base;
unsigned int bit_shift;
};
@@ -49,13 +46,6 @@ static int rmobile_pd_power_down(struct generic_pm_domain *genpd)
unsigned int mask = BIT(rmobile_pd->bit_shift);
u32 val;
- if (rmobile_pd->suspend) {
- int ret = rmobile_pd->suspend();
-
- if (ret)
- return ret;
- }
-
if (readl(rmobile_pd->base + PSTR) & mask) {
writel(mask, rmobile_pd->base + SPDCR);
@@ -98,7 +88,6 @@ static int rmobile_pd_power_up(struct generic_pm_domain *genpd)
static void rmobile_init_pm_domain(struct rmobile_pm_domain *rmobile_pd)
{
struct generic_pm_domain *genpd = &rmobile_pd->genpd;
- struct dev_power_governor *gov = rmobile_pd->gov;
genpd->flags |= GENPD_FLAG_PM_CLK | GENPD_FLAG_ACTIVE_WAKEUP;
genpd->attach_dev = cpg_mstp_attach_dev;
@@ -110,22 +99,12 @@ static void rmobile_init_pm_domain(struct rmobile_pm_domain *rmobile_pd)
__rmobile_pd_power_up(rmobile_pd);
}
- pm_genpd_init(genpd, gov ? : &simple_qos_governor, false);
-}
-
-static int rmobile_pd_suspend_console(void)
-{
- /*
- * Serial consoles make use of SCIF hardware located in this domain,
- * hence keep the power domain on if "no_console_suspend" is set.
- */
- return console_suspend_enabled ? 0 : -EBUSY;
+ pm_genpd_init(genpd, &simple_qos_governor, false);
}
enum pd_types {
PD_NORMAL,
PD_CPU,
- PD_CONSOLE,
PD_DEBUG,
PD_MEMCTL,
};
@@ -184,10 +163,6 @@ static void __init get_special_pds(void)
for_each_of_cpu_node(np)
add_special_pd(np, PD_CPU);
- /* PM domain containing console */
- if (of_stdout)
- add_special_pd(of_stdout, PD_CONSOLE);
-
/* PM domains containing other special devices */
for_each_matching_node_and_match(np, special_ids, &id)
add_special_pd(np, (uintptr_t)id->data);
@@ -227,12 +202,6 @@ static void __init rmobile_setup_pm_domain(struct device_node *np,
pd->genpd.flags |= GENPD_FLAG_ALWAYS_ON;
break;
- case PD_CONSOLE:
- pr_debug("PM domain %s contains serial console\n", name);
- pd->gov = &pm_domain_always_on_gov;
- pd->suspend = rmobile_pd_suspend_console;
- break;
-
case PD_DEBUG:
/*
* This domain contains the Coresight-ETM hardware block and
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 1/3] earlycon: Export clock and PM Domain info from FDT
2024-05-27 12:41 ` [PATCH/RFC 1/3] earlycon: Export clock and PM Domain info from FDT Geert Uytterhoeven
@ 2024-05-29 9:01 ` Geert Uytterhoeven
0 siblings, 0 replies; 14+ messages in thread
From: Geert Uytterhoeven @ 2024-05-29 9:01 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Ulf Hansson, Greg Kroah-Hartman, Jiri Slaby, Rafael J . Wysocki,
Rob Herring, Saravana Kannan, Claudiu Beznea, Peng Fan, linux-pm,
linux-serial, linux-renesas-soc, devicetree, linux-kernel
On Mon, May 27, 2024 at 2:41 PM Geert Uytterhoeven
<geert+renesas@glider.be> wrote:
> Earlycon relies on the serial port to be initialized by the firmware
> and/or bootloader. Linux is not aware of any hardware dependencies that
> must be met to keep the port working, and thus cannot guarantee they
> stay met, until the full serial driver takes over.
>
> E.g. all unused clocks and unused PM Domains are disabled in a late
> initcall. As this happens after the full serial driver has taken over,
> the serial port's clock and/or PM Domain are no longer deemed unused,
> and this is typically not a problem.
>
> However, if the serial port's clock or PM Domain is shared with another
> device, and that other device is runtime-suspended before the full
> serial driver has probed, the serial port's clock and/or PM Domain will
> be disabled inadvertently. Any subsequent serial console output will
> cause a crash or system lock-up.
>
> Provide a mechanism to let the clock and/or PM Domain subsystem or
> drivers handle this, by exporting the clock and PM Domain dependencies
> for the serial port, as available in the system's device tree.
> Note that as this is done during early boot-up, the device_node
> structure pointing to the earlycon console is not yet created, so this
> has to resort to raw property data.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> --- a/include/linux/serial_core.h
> +++ b/include/linux/serial_core.h
> @@ -954,6 +954,16 @@ static const bool earlycon_acpi_spcr_enable EARLYCON_USED_OR_UNUSED;
> static inline int setup_earlycon(char *buf) { return 0; }
> #endif
>
> +#ifdef CONFIG_OF_EARLY_FLATTREE
This should include a check for CONFIG_SERIAL_EARLYCON.
> +extern const __be32 *earlycon_clocks, *earlycon_power_domains;
> +extern int earlycon_clocks_ncells, earlycon_power_domains_ncells;
> +#else
> +#define earlycon_clocks NULL
> +#define earlycon_clocks_ncells 0
> +#define earlycon_power_domains NULL
> +#define earlycon_power_domains_ncells 0
> +#endif
> +
> /* Variant of uart_console_registered() when the console_list_lock is held. */
> static inline bool uart_console_registered_locked(struct uart_port *port)
> {
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-05-27 12:41 [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
` (2 preceding siblings ...)
2024-05-27 12:41 ` [PATCHPATCH 3/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
@ 2024-06-05 9:34 ` Ulf Hansson
2024-06-05 10:41 ` Tomi Valkeinen
2024-06-11 4:52 ` claudiu beznea
4 siblings, 1 reply; 14+ messages in thread
From: Ulf Hansson @ 2024-06-05 9:34 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Greg Kroah-Hartman, Jiri Slaby, Rafael J . Wysocki, Rob Herring,
Saravana Kannan, Claudiu Beznea, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel, Tomi Valkeinen
+ Tomi
On Mon, 27 May 2024 at 14:41, Geert Uytterhoeven
<geert+renesas@glider.be> wrote:
>
> Hi all,
>
> Since commit a47cf07f60dcb02d ("serial: core: Call
> device_set_awake_path() for console port"), the serial driver properly
> handles the case where the serial console is part of the awake path, and
> it looked like we could start removing special serial console handling
> from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
> Unfortunately the devil is in the details, as usual...
>
> Earlycon relies on the serial port to be initialized by the firmware
> and/or bootloader. Linux is not aware of any hardware dependencies that
> must be met to keep the port working, and thus cannot guarantee they
> stay met, until the full serial driver takes over.
>
> E.g. all unused clocks and unused PM Domains are disabled in a late
> initcall. As this happens after the full serial driver has taken over,
> the serial port's clock and/or PM Domain are no longer deemed unused,
> and this is typically not a problem.
>
> However, if the serial port's clock or PM Domain is shared with another
> device, and that other device is runtime-suspended before the full
> serial driver has probed, the serial port's clock and/or PM Domain will
> be disabled inadvertently. Any subsequent serial console output will
> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
> ports share their PM Domain with several other I/O devices. After the
> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
> before the full serial driver takes over, the PM Domain containing the
> early serial port is powered down, causing a lock-up when booted with
> "earlycon".
Hi Geert,
Thanks for the detailed description of the problem! As pointed out in
regards to another similar recent patch [1], this is indeed a generic
problem, not limited to the serial console handling.
At Linaro Connect a few weeks ago I followed up with Saravana from the
earlier discussions at LPC last fall. We now have a generic solution
for genpd drafted on plain paper, based on fw_devlink and the
->sync_state() callback. I am currently working on the genpd series,
while Saravana will re-spin the series (can't find the link to the
last version) for the clock framework. Ideally, we want these things
to work in a very similar way.
That said, allow me to post the series for genpd in a week or two to
see if it can solve your problem too, for the serial console.
Kind regards
Uffe
[1]
https://lore.kernel.org/linux-arm-kernel/CAPDyKFqShuq98qV5nSPzSqwLLUZ7LxLvp1eihGRBkU4qUKdWwQ@mail.gmail.com/
>
> This RFC patch series aims to provide a mechanism for handling this, and
> to fix it for the PM Domain case:
> 1. The first patch provides a mechanism to let the clock and/or PM
> Domain subsystem or drivers handle this, by exporting the clock and
> PM Domain dependencies for the serial port, as available in the
> system's device tree,
> 2. The second patch introduces a new flag to handle a PM domain that
> must be kept powered-on during early boot, and by setting this flag
> if the PM Domain contains the serial console (originally I handled
> this inside rmobile-sysc, but it turned out to be easy to
> generalize this to other platforms in the core PM Domain code).
> 3. The third patch removes the no longer needed special console
> handling from the R-Mobile SYSC PM Domain driver.
>
> I did not fix the similar clock issue, as it is more complex (there can
> be multiple clocks, and each clock provider can have its own value of
> #clock-cells), and I do not need it for Renesas ARM platforms.
I will defer to Sarvana here, but ideally his series for the clock
framework should solve this case too.
>
> This has been tested on the APE6-EVM, Armadillo-800-EVA, and KZM-A9-GT
> development boards, with and without earlycon, including s2ram with and
> without no_console_suspend.
>
> Notes:
> - This should not be needed on RZ/G3S, where each serial port device
> has its own PM Domain,
> - drivers/clk/imx/clk.c and drivers/pmdomain/imx/scu-pd.c have special
> handling for the of_stdout device, but is probably not affected, as
> each serial port seems to share its PM Domain only with the serial
> port's clock controller.
>
> Thanks for your comments!
>
> Geert Uytterhoeven (3):
> earlycon: Export clock and PM Domain info from FDT
> pmdomain: core: Avoid earlycon power-down
> pmdomain: renesas: rmobile-sysc: Remove serial console handling
>
> drivers/pmdomain/core.c | 24 ++++++++++++++++--
> drivers/pmdomain/renesas/rmobile-sysc.c | 33 +------------------------
> drivers/tty/serial/earlycon.c | 14 ++++++++++-
> include/linux/pm_domain.h | 4 +++
> include/linux/serial_core.h | 10 ++++++++
> 5 files changed, 50 insertions(+), 35 deletions(-)
>
> --
Kind regards
Uffe
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-06-05 9:34 ` [PATCH/RFC 0/3] " Ulf Hansson
@ 2024-06-05 10:41 ` Tomi Valkeinen
2024-06-05 10:53 ` Ulf Hansson
0 siblings, 1 reply; 14+ messages in thread
From: Tomi Valkeinen @ 2024-06-05 10:41 UTC (permalink / raw)
To: Ulf Hansson, Geert Uytterhoeven
Cc: Greg Kroah-Hartman, Jiri Slaby, Rafael J . Wysocki, Rob Herring,
Saravana Kannan, Claudiu Beznea, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel, Devarsh Thakkar,
Laurent Pinchart
Hi Ulf,
On 05/06/2024 12:34, Ulf Hansson wrote:
> + Tomi
>
> On Mon, 27 May 2024 at 14:41, Geert Uytterhoeven
> <geert+renesas@glider.be> wrote:
>>
>> Hi all,
>>
>> Since commit a47cf07f60dcb02d ("serial: core: Call
>> device_set_awake_path() for console port"), the serial driver properly
>> handles the case where the serial console is part of the awake path, and
>> it looked like we could start removing special serial console handling
>> from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
>> Unfortunately the devil is in the details, as usual...
>>
>> Earlycon relies on the serial port to be initialized by the firmware
>> and/or bootloader. Linux is not aware of any hardware dependencies that
>> must be met to keep the port working, and thus cannot guarantee they
>> stay met, until the full serial driver takes over.
>>
>> E.g. all unused clocks and unused PM Domains are disabled in a late
>> initcall. As this happens after the full serial driver has taken over,
>> the serial port's clock and/or PM Domain are no longer deemed unused,
>> and this is typically not a problem.
>>
>> However, if the serial port's clock or PM Domain is shared with another
>> device, and that other device is runtime-suspended before the full
>> serial driver has probed, the serial port's clock and/or PM Domain will
>> be disabled inadvertently. Any subsequent serial console output will
>> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
>> ports share their PM Domain with several other I/O devices. After the
>> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
>> before the full serial driver takes over, the PM Domain containing the
>> early serial port is powered down, causing a lock-up when booted with
>> "earlycon".
>
> Hi Geert,
>
> Thanks for the detailed description of the problem! As pointed out in
> regards to another similar recent patch [1], this is indeed a generic
> problem, not limited to the serial console handling.
>
> At Linaro Connect a few weeks ago I followed up with Saravana from the
> earlier discussions at LPC last fall. We now have a generic solution
> for genpd drafted on plain paper, based on fw_devlink and the
> ->sync_state() callback. I am currently working on the genpd series,
> while Saravana will re-spin the series (can't find the link to the
> last version) for the clock framework. Ideally, we want these things
> to work in a very similar way.
>
> That said, allow me to post the series for genpd in a week or two to
> see if it can solve your problem too, for the serial console.
Both the genpd and the clock solutions will make suppliers depend on all
their consumers to be probed, right?
I think it is a solution, and should be worked on, but it has the
drawback that suppliers that have consumers that will possibly never be
probed, will also never be able to turn off unused resources.
This was specifically the case with the TI ti-sci pmdomain case I was
looking at: the genpd driver (ti_sci_pm_domains.c) provides a lot of
genpds for totally unrelated devices, and so if, e.g., you don't have or
don't want to load a driver for the GPU, all PDs are affected.
Even here the solutions you mention will help: instead of things getting
broken because genpds get turned off while they are actually in use, the
genpds will be kept enabled, thus fixing the breakage. Unfortunately,
they'll be kept enabled forever.
I've been ill for quite a while so I haven't had the chance to look at
this more, but before that I was hacking around a bit with something I
named .partial_sync_state(). .sync_state() gets called when all the
consumers have probed, but .partial_sync_state() gets called when _a_
consumer has been probed.
For the .sync_state() things are easy for the driver, as it knows
everything related has been probed, but for .partial_sync_state() the
driver needs to track resources internally. .partial_sync_state() will
tell the driver that a consumer device has probed, the driver can then
find out which specific resources (genpds in my case) that consumer
refers to, and then... Well, that's how far I got with my hacks =).
So, I don't know if this .partial_sync_state() can even work, but I
think we do need something more on top of the .sync_state().
Tomi
>
> Kind regards
> Uffe
>
> [1]
> https://lore.kernel.org/linux-arm-kernel/CAPDyKFqShuq98qV5nSPzSqwLLUZ7LxLvp1eihGRBkU4qUKdWwQ@mail.gmail.com/
>
>>
>> This RFC patch series aims to provide a mechanism for handling this, and
>> to fix it for the PM Domain case:
>> 1. The first patch provides a mechanism to let the clock and/or PM
>> Domain subsystem or drivers handle this, by exporting the clock and
>> PM Domain dependencies for the serial port, as available in the
>> system's device tree,
>> 2. The second patch introduces a new flag to handle a PM domain that
>> must be kept powered-on during early boot, and by setting this flag
>> if the PM Domain contains the serial console (originally I handled
>> this inside rmobile-sysc, but it turned out to be easy to
>> generalize this to other platforms in the core PM Domain code).
>> 3. The third patch removes the no longer needed special console
>> handling from the R-Mobile SYSC PM Domain driver.
>>
>> I did not fix the similar clock issue, as it is more complex (there can
>> be multiple clocks, and each clock provider can have its own value of
>> #clock-cells), and I do not need it for Renesas ARM platforms.
>
> I will defer to Sarvana here, but ideally his series for the clock
> framework should solve this case too.
>
>>
>> This has been tested on the APE6-EVM, Armadillo-800-EVA, and KZM-A9-GT
>> development boards, with and without earlycon, including s2ram with and
>> without no_console_suspend.
>>
>> Notes:
>> - This should not be needed on RZ/G3S, where each serial port device
>> has its own PM Domain,
>> - drivers/clk/imx/clk.c and drivers/pmdomain/imx/scu-pd.c have special
>> handling for the of_stdout device, but is probably not affected, as
>> each serial port seems to share its PM Domain only with the serial
>> port's clock controller.
>>
>> Thanks for your comments!
>>
>> Geert Uytterhoeven (3):
>> earlycon: Export clock and PM Domain info from FDT
>> pmdomain: core: Avoid earlycon power-down
>> pmdomain: renesas: rmobile-sysc: Remove serial console handling
>>
>> drivers/pmdomain/core.c | 24 ++++++++++++++++--
>> drivers/pmdomain/renesas/rmobile-sysc.c | 33 +------------------------
>> drivers/tty/serial/earlycon.c | 14 ++++++++++-
>> include/linux/pm_domain.h | 4 +++
>> include/linux/serial_core.h | 10 ++++++++
>> 5 files changed, 50 insertions(+), 35 deletions(-)
>>
>> --
>
> Kind regards
> Uffe
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-06-05 10:41 ` Tomi Valkeinen
@ 2024-06-05 10:53 ` Ulf Hansson
2024-06-05 11:16 ` Tomi Valkeinen
0 siblings, 1 reply; 14+ messages in thread
From: Ulf Hansson @ 2024-06-05 10:53 UTC (permalink / raw)
To: Tomi Valkeinen
Cc: Geert Uytterhoeven, Greg Kroah-Hartman, Jiri Slaby,
Rafael J . Wysocki, Rob Herring, Saravana Kannan, Claudiu Beznea,
Peng Fan, linux-pm, linux-serial, linux-renesas-soc, devicetree,
linux-kernel, Devarsh Thakkar, Laurent Pinchart
On Wed, 5 Jun 2024 at 12:41, Tomi Valkeinen
<tomi.valkeinen@ideasonboard.com> wrote:
>
> Hi Ulf,
>
> On 05/06/2024 12:34, Ulf Hansson wrote:
> > + Tomi
> >
> > On Mon, 27 May 2024 at 14:41, Geert Uytterhoeven
> > <geert+renesas@glider.be> wrote:
> >>
> >> Hi all,
> >>
> >> Since commit a47cf07f60dcb02d ("serial: core: Call
> >> device_set_awake_path() for console port"), the serial driver properly
> >> handles the case where the serial console is part of the awake path, and
> >> it looked like we could start removing special serial console handling
> >> from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
> >> Unfortunately the devil is in the details, as usual...
> >>
> >> Earlycon relies on the serial port to be initialized by the firmware
> >> and/or bootloader. Linux is not aware of any hardware dependencies that
> >> must be met to keep the port working, and thus cannot guarantee they
> >> stay met, until the full serial driver takes over.
> >>
> >> E.g. all unused clocks and unused PM Domains are disabled in a late
> >> initcall. As this happens after the full serial driver has taken over,
> >> the serial port's clock and/or PM Domain are no longer deemed unused,
> >> and this is typically not a problem.
> >>
> >> However, if the serial port's clock or PM Domain is shared with another
> >> device, and that other device is runtime-suspended before the full
> >> serial driver has probed, the serial port's clock and/or PM Domain will
> >> be disabled inadvertently. Any subsequent serial console output will
> >> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
> >> ports share their PM Domain with several other I/O devices. After the
> >> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
> >> before the full serial driver takes over, the PM Domain containing the
> >> early serial port is powered down, causing a lock-up when booted with
> >> "earlycon".
> >
> > Hi Geert,
> >
> > Thanks for the detailed description of the problem! As pointed out in
> > regards to another similar recent patch [1], this is indeed a generic
> > problem, not limited to the serial console handling.
> >
> > At Linaro Connect a few weeks ago I followed up with Saravana from the
> > earlier discussions at LPC last fall. We now have a generic solution
> > for genpd drafted on plain paper, based on fw_devlink and the
> > ->sync_state() callback. I am currently working on the genpd series,
> > while Saravana will re-spin the series (can't find the link to the
> > last version) for the clock framework. Ideally, we want these things
> > to work in a very similar way.
> >
> > That said, allow me to post the series for genpd in a week or two to
> > see if it can solve your problem too, for the serial console.
>
> Both the genpd and the clock solutions will make suppliers depend on all
> their consumers to be probed, right?
>
> I think it is a solution, and should be worked on, but it has the
> drawback that suppliers that have consumers that will possibly never be
> probed, will also never be able to turn off unused resources.
>
> This was specifically the case with the TI ti-sci pmdomain case I was
> looking at: the genpd driver (ti_sci_pm_domains.c) provides a lot of
> genpds for totally unrelated devices, and so if, e.g., you don't have or
> don't want to load a driver for the GPU, all PDs are affected.
>
> Even here the solutions you mention will help: instead of things getting
> broken because genpds get turned off while they are actually in use, the
> genpds will be kept enabled, thus fixing the breakage. Unfortunately,
> they'll be kept enabled forever.
>
> I've been ill for quite a while so I haven't had the chance to look at
> this more, but before that I was hacking around a bit with something I
> named .partial_sync_state(). .sync_state() gets called when all the
> consumers have probed, but .partial_sync_state() gets called when _a_
> consumer has been probed.
>
> For the .sync_state() things are easy for the driver, as it knows
> everything related has been probed, but for .partial_sync_state() the
> driver needs to track resources internally. .partial_sync_state() will
> tell the driver that a consumer device has probed, the driver can then
> find out which specific resources (genpds in my case) that consumer
> refers to, and then... Well, that's how far I got with my hacks =).
>
> So, I don't know if this .partial_sync_state() can even work, but I
> think we do need something more on top of the .sync_state().
Thanks for the update!
You certainly have a point, but rather than implementing some platform
specific method, I think we should be able enforce the call to
->sync_state(), based upon some condition/timeout - and even if all
consumers haven't been probed.
[...]
Kind regards
Uffe
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-06-05 10:53 ` Ulf Hansson
@ 2024-06-05 11:16 ` Tomi Valkeinen
2024-06-21 1:07 ` Saravana Kannan
0 siblings, 1 reply; 14+ messages in thread
From: Tomi Valkeinen @ 2024-06-05 11:16 UTC (permalink / raw)
To: Ulf Hansson
Cc: Geert Uytterhoeven, Greg Kroah-Hartman, Jiri Slaby,
Rafael J . Wysocki, Rob Herring, Saravana Kannan, Claudiu Beznea,
Peng Fan, linux-pm, linux-serial, linux-renesas-soc, devicetree,
linux-kernel, Devarsh Thakkar, Laurent Pinchart
On 05/06/2024 13:53, Ulf Hansson wrote:
> On Wed, 5 Jun 2024 at 12:41, Tomi Valkeinen
> <tomi.valkeinen@ideasonboard.com> wrote:
>>
>> Hi Ulf,
>>
>> On 05/06/2024 12:34, Ulf Hansson wrote:
>>> + Tomi
>>>
>>> On Mon, 27 May 2024 at 14:41, Geert Uytterhoeven
>>> <geert+renesas@glider.be> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Since commit a47cf07f60dcb02d ("serial: core: Call
>>>> device_set_awake_path() for console port"), the serial driver properly
>>>> handles the case where the serial console is part of the awake path, and
>>>> it looked like we could start removing special serial console handling
>>>> from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
>>>> Unfortunately the devil is in the details, as usual...
>>>>
>>>> Earlycon relies on the serial port to be initialized by the firmware
>>>> and/or bootloader. Linux is not aware of any hardware dependencies that
>>>> must be met to keep the port working, and thus cannot guarantee they
>>>> stay met, until the full serial driver takes over.
>>>>
>>>> E.g. all unused clocks and unused PM Domains are disabled in a late
>>>> initcall. As this happens after the full serial driver has taken over,
>>>> the serial port's clock and/or PM Domain are no longer deemed unused,
>>>> and this is typically not a problem.
>>>>
>>>> However, if the serial port's clock or PM Domain is shared with another
>>>> device, and that other device is runtime-suspended before the full
>>>> serial driver has probed, the serial port's clock and/or PM Domain will
>>>> be disabled inadvertently. Any subsequent serial console output will
>>>> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
>>>> ports share their PM Domain with several other I/O devices. After the
>>>> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
>>>> before the full serial driver takes over, the PM Domain containing the
>>>> early serial port is powered down, causing a lock-up when booted with
>>>> "earlycon".
>>>
>>> Hi Geert,
>>>
>>> Thanks for the detailed description of the problem! As pointed out in
>>> regards to another similar recent patch [1], this is indeed a generic
>>> problem, not limited to the serial console handling.
>>>
>>> At Linaro Connect a few weeks ago I followed up with Saravana from the
>>> earlier discussions at LPC last fall. We now have a generic solution
>>> for genpd drafted on plain paper, based on fw_devlink and the
>>> ->sync_state() callback. I am currently working on the genpd series,
>>> while Saravana will re-spin the series (can't find the link to the
>>> last version) for the clock framework. Ideally, we want these things
>>> to work in a very similar way.
>>>
>>> That said, allow me to post the series for genpd in a week or two to
>>> see if it can solve your problem too, for the serial console.
>>
>> Both the genpd and the clock solutions will make suppliers depend on all
>> their consumers to be probed, right?
>>
>> I think it is a solution, and should be worked on, but it has the
>> drawback that suppliers that have consumers that will possibly never be
>> probed, will also never be able to turn off unused resources.
>>
>> This was specifically the case with the TI ti-sci pmdomain case I was
>> looking at: the genpd driver (ti_sci_pm_domains.c) provides a lot of
>> genpds for totally unrelated devices, and so if, e.g., you don't have or
>> don't want to load a driver for the GPU, all PDs are affected.
>>
>> Even here the solutions you mention will help: instead of things getting
>> broken because genpds get turned off while they are actually in use, the
>> genpds will be kept enabled, thus fixing the breakage. Unfortunately,
>> they'll be kept enabled forever.
>>
>> I've been ill for quite a while so I haven't had the chance to look at
>> this more, but before that I was hacking around a bit with something I
>> named .partial_sync_state(). .sync_state() gets called when all the
>> consumers have probed, but .partial_sync_state() gets called when _a_
>> consumer has been probed.
>>
>> For the .sync_state() things are easy for the driver, as it knows
>> everything related has been probed, but for .partial_sync_state() the
>> driver needs to track resources internally. .partial_sync_state() will
>> tell the driver that a consumer device has probed, the driver can then
>> find out which specific resources (genpds in my case) that consumer
>> refers to, and then... Well, that's how far I got with my hacks =).
>>
>> So, I don't know if this .partial_sync_state() can even work, but I
>> think we do need something more on top of the .sync_state().
>
> Thanks for the update!
>
> You certainly have a point, but rather than implementing some platform
> specific method, I think we should be able enforce the call to
> ->sync_state(), based upon some condition/timeout - and even if all
> consumers haven't been probed.
Hmm, I think that was already implemented in some of the serieses out
there (or even in mainline already?), as I remember doing some
experiments with it. I don't like it much, though.
With a simple timeout, it'll always be just a bit too early for some
user (nfs mount took a bit more time than expected -> board frozen).
The only condition I can see that would somewhat work is a manual
trigger from the userspace. The boot scripts could then signal the
kernel when all the modules have been loaded and probably a suitable,
platform/use case specific amount of time has passed to allow the
drivers to probe.
It just feels a bit too much of a "let's hope this work" approach.
That said, the timeout/condition is probably acceptable for many cases,
where turning off a resource forcefully will just result in, say, a
temporarily blanked display, or something else that gets fixed if and
when the proper driver is probed.
Unfortunately, here with the case I have, the whole board gets halted if
the display subsystem genpd is turned off and the display driver is
loaded after that.
Tomi
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-05-27 12:41 [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
` (3 preceding siblings ...)
2024-06-05 9:34 ` [PATCH/RFC 0/3] " Ulf Hansson
@ 2024-06-11 4:52 ` claudiu beznea
4 siblings, 0 replies; 14+ messages in thread
From: claudiu beznea @ 2024-06-11 4:52 UTC (permalink / raw)
To: Geert Uytterhoeven, Ulf Hansson, Greg Kroah-Hartman, Jiri Slaby,
Rafael J . Wysocki, Rob Herring, Saravana Kannan
Cc: Claudiu Beznea, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel
Hi, Geert,
On 27.05.2024 15:41, Geert Uytterhoeven wrote:
> Hi all,
>
> Since commit a47cf07f60dcb02d ("serial: core: Call
> device_set_awake_path() for console port"), the serial driver properly
> handles the case where the serial console is part of the awake path, and
> it looked like we could start removing special serial console handling
> from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
> Unfortunately the devil is in the details, as usual...
>
> Earlycon relies on the serial port to be initialized by the firmware
> and/or bootloader. Linux is not aware of any hardware dependencies that
> must be met to keep the port working, and thus cannot guarantee they
> stay met, until the full serial driver takes over.
>
> E.g. all unused clocks and unused PM Domains are disabled in a late
> initcall. As this happens after the full serial driver has taken over,
> the serial port's clock and/or PM Domain are no longer deemed unused,
> and this is typically not a problem.
>
> However, if the serial port's clock or PM Domain is shared with another
> device, and that other device is runtime-suspended before the full
> serial driver has probed, the serial port's clock and/or PM Domain will
> be disabled inadvertently. Any subsequent serial console output will
> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
> ports share their PM Domain with several other I/O devices. After the
> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
> before the full serial driver takes over, the PM Domain containing the
> early serial port is powered down, causing a lock-up when booted with
> "earlycon".
>
> This RFC patch series aims to provide a mechanism for handling this, and
> to fix it for the PM Domain case:
> 1. The first patch provides a mechanism to let the clock and/or PM
> Domain subsystem or drivers handle this, by exporting the clock and
> PM Domain dependencies for the serial port, as available in the
> system's device tree,
> 2. The second patch introduces a new flag to handle a PM domain that
> must be kept powered-on during early boot, and by setting this flag
> if the PM Domain contains the serial console (originally I handled
> this inside rmobile-sysc, but it turned out to be easy to
> generalize this to other platforms in the core PM Domain code).
> 3. The third patch removes the no longer needed special console
> handling from the R-Mobile SYSC PM Domain driver.
>
> I did not fix the similar clock issue, as it is more complex (there can
> be multiple clocks, and each clock provider can have its own value of
> #clock-cells), and I do not need it for Renesas ARM platforms.
>
> This has been tested on the APE6-EVM, Armadillo-800-EVA, and KZM-A9-GT
> development boards, with and without earlycon, including s2ram with and
> without no_console_suspend.
>
> Notes:
> - This should not be needed on RZ/G3S, where each serial port device
> has its own PM Domain,
For the record, I've tested this series on RZ/G3S. All good with it.
If any, you can add my:
Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Thank you,
Claudiu Beznea
> - drivers/clk/imx/clk.c and drivers/pmdomain/imx/scu-pd.c have special
> handling for the of_stdout device, but is probably not affected, as
> each serial port seems to share its PM Domain only with the serial
> port's clock controller.
>
> Thanks for your comments!
>
> Geert Uytterhoeven (3):
> earlycon: Export clock and PM Domain info from FDT
> pmdomain: core: Avoid earlycon power-down
> pmdomain: renesas: rmobile-sysc: Remove serial console handling
>
> drivers/pmdomain/core.c | 24 ++++++++++++++++--
> drivers/pmdomain/renesas/rmobile-sysc.c | 33 +------------------------
> drivers/tty/serial/earlycon.c | 14 ++++++++++-
> include/linux/pm_domain.h | 4 +++
> include/linux/serial_core.h | 10 ++++++++
> 5 files changed, 50 insertions(+), 35 deletions(-)
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-06-05 11:16 ` Tomi Valkeinen
@ 2024-06-21 1:07 ` Saravana Kannan
2024-06-21 7:07 ` Geert Uytterhoeven
0 siblings, 1 reply; 14+ messages in thread
From: Saravana Kannan @ 2024-06-21 1:07 UTC (permalink / raw)
To: Tomi Valkeinen
Cc: Ulf Hansson, Geert Uytterhoeven, Greg Kroah-Hartman, Jiri Slaby,
Rafael J . Wysocki, Rob Herring, Claudiu Beznea, Peng Fan,
linux-pm, linux-serial, linux-renesas-soc, devicetree,
linux-kernel, Devarsh Thakkar, Laurent Pinchart
On Wed, Jun 5, 2024 at 4:16 AM Tomi Valkeinen
<tomi.valkeinen@ideasonboard.com> wrote:
>
> On 05/06/2024 13:53, Ulf Hansson wrote:
> > On Wed, 5 Jun 2024 at 12:41, Tomi Valkeinen
> > <tomi.valkeinen@ideasonboard.com> wrote:
> >>
> >> Hi Ulf,
> >>
> >> On 05/06/2024 12:34, Ulf Hansson wrote:
> >>> + Tomi
> >>>
> >>> On Mon, 27 May 2024 at 14:41, Geert Uytterhoeven
> >>> <geert+renesas@glider.be> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> Since commit a47cf07f60dcb02d ("serial: core: Call
> >>>> device_set_awake_path() for console port"), the serial driver properly
> >>>> handles the case where the serial console is part of the awake path, and
> >>>> it looked like we could start removing special serial console handling
> >>>> from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
> >>>> Unfortunately the devil is in the details, as usual...
> >>>>
> >>>> Earlycon relies on the serial port to be initialized by the firmware
> >>>> and/or bootloader. Linux is not aware of any hardware dependencies that
> >>>> must be met to keep the port working, and thus cannot guarantee they
> >>>> stay met, until the full serial driver takes over.
> >>>>
> >>>> E.g. all unused clocks and unused PM Domains are disabled in a late
> >>>> initcall. As this happens after the full serial driver has taken over,
> >>>> the serial port's clock and/or PM Domain are no longer deemed unused,
> >>>> and this is typically not a problem.
> >>>>
> >>>> However, if the serial port's clock or PM Domain is shared with another
> >>>> device, and that other device is runtime-suspended before the full
> >>>> serial driver has probed, the serial port's clock and/or PM Domain will
> >>>> be disabled inadvertently. Any subsequent serial console output will
> >>>> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
> >>>> ports share their PM Domain with several other I/O devices. After the
> >>>> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
> >>>> before the full serial driver takes over, the PM Domain containing the
> >>>> early serial port is powered down, causing a lock-up when booted with
> >>>> "earlycon".
> >>>
> >>> Hi Geert,
> >>>
> >>> Thanks for the detailed description of the problem! As pointed out in
> >>> regards to another similar recent patch [1], this is indeed a generic
> >>> problem, not limited to the serial console handling.
> >>>
> >>> At Linaro Connect a few weeks ago I followed up with Saravana from the
> >>> earlier discussions at LPC last fall. We now have a generic solution
> >>> for genpd drafted on plain paper, based on fw_devlink and the
> >>> ->sync_state() callback. I am currently working on the genpd series,
> >>> while Saravana will re-spin the series (can't find the link to the
> >>> last version) for the clock framework. Ideally, we want these things
> >>> to work in a very similar way.
> >>>
> >>> That said, allow me to post the series for genpd in a week or two to
> >>> see if it can solve your problem too, for the serial console.
> >>
> >> Both the genpd and the clock solutions will make suppliers depend on all
> >> their consumers to be probed, right?
> >>
> >> I think it is a solution, and should be worked on, but it has the
> >> drawback that suppliers that have consumers that will possibly never be
> >> probed, will also never be able to turn off unused resources.
> >>
> >> This was specifically the case with the TI ti-sci pmdomain case I was
> >> looking at: the genpd driver (ti_sci_pm_domains.c) provides a lot of
> >> genpds for totally unrelated devices, and so if, e.g., you don't have or
> >> don't want to load a driver for the GPU, all PDs are affected.
> >>
> >> Even here the solutions you mention will help: instead of things getting
> >> broken because genpds get turned off while they are actually in use, the
> >> genpds will be kept enabled, thus fixing the breakage. Unfortunately,
> >> they'll be kept enabled forever.
> >>
> >> I've been ill for quite a while so I haven't had the chance to look at
> >> this more, but before that I was hacking around a bit with something I
> >> named .partial_sync_state(). .sync_state() gets called when all the
> >> consumers have probed, but .partial_sync_state() gets called when _a_
> >> consumer has been probed.
> >>
> >> For the .sync_state() things are easy for the driver, as it knows
> >> everything related has been probed, but for .partial_sync_state() the
> >> driver needs to track resources internally. .partial_sync_state() will
> >> tell the driver that a consumer device has probed, the driver can then
> >> find out which specific resources (genpds in my case) that consumer
> >> refers to, and then... Well, that's how far I got with my hacks =).
> >>
> >> So, I don't know if this .partial_sync_state() can even work, but I
> >> think we do need something more on top of the .sync_state().
> >
> > Thanks for the update!
> >
> > You certainly have a point, but rather than implementing some platform
> > specific method, I think we should be able enforce the call to
> > ->sync_state(), based upon some condition/timeout - and even if all
> > consumers haven't been probed.
>
> Hmm, I think that was already implemented in some of the serieses out
> there (or even in mainline already?), as I remember doing some
> experiments with it. I don't like it much, though.
>
> With a simple timeout, it'll always be just a bit too early for some
> user (nfs mount took a bit more time than expected -> board frozen).
>
> The only condition I can see that would somewhat work is a manual
> trigger from the userspace. The boot scripts could then signal the
> kernel when all the modules have been loaded and probably a suitable,
> platform/use case specific amount of time has passed to allow the
> drivers to probe.
This is also already supported in mainline.
Devices with sync_state() implementations (once Ulf adds it) will have
a state_synced file in sysfs. It shows where it has been called yet or
not. But you can also echo 1 into it to force the sync_state()
callback (only if it hasn't been called already). So, yeah, all
methods of handling this are available if you implement the
sync_state() callback.
By default it's all strict (wait till all consumers probe
successfully). But you can set it to timeout (fw_devlink.sync_state).
And you also have the option I mentioned above that you can use with
both cases.
-Saravana
>
> It just feels a bit too much of a "let's hope this work" approach.
>
> That said, the timeout/condition is probably acceptable for many cases,
> where turning off a resource forcefully will just result in, say, a
> temporarily blanked display, or something else that gets fixed if and
> when the proper driver is probed.
>
> Unfortunately, here with the case I have, the whole board gets halted if
> the display subsystem genpd is turned off and the display driver is
> loaded after that.
>
> Tomi
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-06-21 1:07 ` Saravana Kannan
@ 2024-06-21 7:07 ` Geert Uytterhoeven
2024-06-21 8:49 ` Tomi Valkeinen
2024-07-09 15:20 ` Ulf Hansson
0 siblings, 2 replies; 14+ messages in thread
From: Geert Uytterhoeven @ 2024-06-21 7:07 UTC (permalink / raw)
To: Saravana Kannan
Cc: Tomi Valkeinen, Ulf Hansson, Greg Kroah-Hartman, Jiri Slaby,
Rafael J . Wysocki, Rob Herring, Claudiu Beznea, Peng Fan,
linux-pm, linux-serial, linux-renesas-soc, devicetree,
linux-kernel, Devarsh Thakkar, Laurent Pinchart
Hi Saravana,
On Fri, Jun 21, 2024 at 3:08 AM Saravana Kannan <saravanak@google.com> wrote:
> On Wed, Jun 5, 2024 at 4:16 AM Tomi Valkeinen
> <tomi.valkeinen@ideasonboard.com> wrote:
> > On 05/06/2024 13:53, Ulf Hansson wrote:
> > > On Wed, 5 Jun 2024 at 12:41, Tomi Valkeinen
> > > <tomi.valkeinen@ideasonboard.com> wrote:
> > >> On 05/06/2024 12:34, Ulf Hansson wrote:
> > >>> On Mon, 27 May 2024 at 14:41, Geert Uytterhoeven
> > >>> <geert+renesas@glider.be> wrote:
> > >>>> Since commit a47cf07f60dcb02d ("serial: core: Call
> > >>>> device_set_awake_path() for console port"), the serial driver properly
> > >>>> handles the case where the serial console is part of the awake path, and
> > >>>> it looked like we could start removing special serial console handling
> > >>>> from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
> > >>>> Unfortunately the devil is in the details, as usual...
> > >>>>
> > >>>> Earlycon relies on the serial port to be initialized by the firmware
> > >>>> and/or bootloader. Linux is not aware of any hardware dependencies that
> > >>>> must be met to keep the port working, and thus cannot guarantee they
> > >>>> stay met, until the full serial driver takes over.
> > >>>>
> > >>>> E.g. all unused clocks and unused PM Domains are disabled in a late
> > >>>> initcall. As this happens after the full serial driver has taken over,
> > >>>> the serial port's clock and/or PM Domain are no longer deemed unused,
> > >>>> and this is typically not a problem.
Let's call this "Case A".
> > >>>>
> > >>>> However, if the serial port's clock or PM Domain is shared with another
> > >>>> device, and that other device is runtime-suspended before the full
> > >>>> serial driver has probed, the serial port's clock and/or PM Domain will
> > >>>> be disabled inadvertently. Any subsequent serial console output will
> > >>>> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
> > >>>> ports share their PM Domain with several other I/O devices. After the
> > >>>> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
> > >>>> before the full serial driver takes over, the PM Domain containing the
> > >>>> early serial port is powered down, causing a lock-up when booted with
> > >>>> "earlycon".
Let's call this "Case B".
> > >>>
> > >>> Thanks for the detailed description of the problem! As pointed out in
> > >>> regards to another similar recent patch [1], this is indeed a generic
> > >>> problem, not limited to the serial console handling.
> > >>>
> > >>> At Linaro Connect a few weeks ago I followed up with Saravana from the
> > >>> earlier discussions at LPC last fall. We now have a generic solution
> > >>> for genpd drafted on plain paper, based on fw_devlink and the
> > >>> ->sync_state() callback. I am currently working on the genpd series,
> > >>> while Saravana will re-spin the series (can't find the link to the
> > >>> last version) for the clock framework. Ideally, we want these things
> > >>> to work in a very similar way.
> > >>>
> > >>> That said, allow me to post the series for genpd in a week or two to
> > >>> see if it can solve your problem too, for the serial console.
> > >>
> > >> Both the genpd and the clock solutions will make suppliers depend on all
> > >> their consumers to be probed, right?
> > >>
> > >> I think it is a solution, and should be worked on, but it has the
> > >> drawback that suppliers that have consumers that will possibly never be
> > >> probed, will also never be able to turn off unused resources.
> > >>
> > >> This was specifically the case with the TI ti-sci pmdomain case I was
> > >> looking at: the genpd driver (ti_sci_pm_domains.c) provides a lot of
> > >> genpds for totally unrelated devices, and so if, e.g., you don't have or
> > >> don't want to load a driver for the GPU, all PDs are affected.
> > >>
> > >> Even here the solutions you mention will help: instead of things getting
> > >> broken because genpds get turned off while they are actually in use, the
> > >> genpds will be kept enabled, thus fixing the breakage. Unfortunately,
> > >> they'll be kept enabled forever.
> > >>
> > >> I've been ill for quite a while so I haven't had the chance to look at
> > >> this more, but before that I was hacking around a bit with something I
> > >> named .partial_sync_state(). .sync_state() gets called when all the
> > >> consumers have probed, but .partial_sync_state() gets called when _a_
> > >> consumer has been probed.
> > >>
> > >> For the .sync_state() things are easy for the driver, as it knows
> > >> everything related has been probed, but for .partial_sync_state() the
> > >> driver needs to track resources internally. .partial_sync_state() will
> > >> tell the driver that a consumer device has probed, the driver can then
> > >> find out which specific resources (genpds in my case) that consumer
> > >> refers to, and then... Well, that's how far I got with my hacks =).
> > >>
> > >> So, I don't know if this .partial_sync_state() can even work, but I
> > >> think we do need something more on top of the .sync_state().
> > >
> > > Thanks for the update!
> > >
> > > You certainly have a point, but rather than implementing some platform
> > > specific method, I think we should be able enforce the call to
> > > ->sync_state(), based upon some condition/timeout - and even if all
> > > consumers haven't been probed.
> >
> > Hmm, I think that was already implemented in some of the serieses out
> > there (or even in mainline already?), as I remember doing some
> > experiments with it. I don't like it much, though.
> >
> > With a simple timeout, it'll always be just a bit too early for some
> > user (nfs mount took a bit more time than expected -> board frozen).
> >
> > The only condition I can see that would somewhat work is a manual
> > trigger from the userspace. The boot scripts could then signal the
> > kernel when all the modules have been loaded and probably a suitable,
> > platform/use case specific amount of time has passed to allow the
> > drivers to probe.
>
> This is also already supported in mainline.
>
> Devices with sync_state() implementations (once Ulf adds it) will have
> a state_synced file in sysfs. It shows where it has been called yet or
> not. But you can also echo 1 into it to force the sync_state()
> callback (only if it hasn't been called already). So, yeah, all
> methods of handling this are available if you implement the
> sync_state() callback.
>
> By default it's all strict (wait till all consumers probe
> successfully). But you can set it to timeout (fw_devlink.sync_state).
> And you also have the option I mentioned above that you can use with
> both cases.
So the idea is to disable unused genpds and clocks from the genpd
resp. clock's driver .sync_state() callback, instead of from a late
initcall? That would indeed solve issues related to "Case A".
However, how to solve "Case B"? Ignore disabling genpds or clocks
before .sync_state() callback() has been called?
That would cause issues for cases where the clock must be disabled,
cfr.
"[PATCH RFC 0/3] Add clk_disable_unprepare_sync()"
https://lore.kernel.org/all/20240131160947.96171-1-biju.das.jz@bp.renesas.com/
"[PATCH v3 0/3] Add clk_poll_disable_unprepare()"
https://lore.kernel.org/linux-renesas-soc/20240318110842.41956-1-biju.das.jz@bp.renesas.com/
> > It just feels a bit too much of a "let's hope this work" approach.
> >
> > That said, the timeout/condition is probably acceptable for many cases,
> > where turning off a resource forcefully will just result in, say, a
> > temporarily blanked display, or something else that gets fixed if and
> > when the proper driver is probed.
> >
> > Unfortunately, here with the case I have, the whole board gets halted if
> > the display subsystem genpd is turned off and the display driver is
> > loaded after that.
Tomi: Do you have more details? The genpd must be controlling something
critical that must never be turned off, or perhaps the display driver
lacks some initialization?
Thanks!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-06-21 7:07 ` Geert Uytterhoeven
@ 2024-06-21 8:49 ` Tomi Valkeinen
2024-07-09 15:20 ` Ulf Hansson
1 sibling, 0 replies; 14+ messages in thread
From: Tomi Valkeinen @ 2024-06-21 8:49 UTC (permalink / raw)
To: Geert Uytterhoeven, Saravana Kannan
Cc: Ulf Hansson, Greg Kroah-Hartman, Jiri Slaby, Rafael J . Wysocki,
Rob Herring, Claudiu Beznea, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel, Devarsh Thakkar,
Laurent Pinchart
On 21/06/2024 10:07, Geert Uytterhoeven wrote:
> Hi Saravana,
>
> On Fri, Jun 21, 2024 at 3:08 AM Saravana Kannan <saravanak@google.com> wrote:
>> On Wed, Jun 5, 2024 at 4:16 AM Tomi Valkeinen
>> <tomi.valkeinen@ideasonboard.com> wrote:
>>> On 05/06/2024 13:53, Ulf Hansson wrote:
>>>> On Wed, 5 Jun 2024 at 12:41, Tomi Valkeinen
>>>> <tomi.valkeinen@ideasonboard.com> wrote:
>>>>> On 05/06/2024 12:34, Ulf Hansson wrote:
>>>>>> On Mon, 27 May 2024 at 14:41, Geert Uytterhoeven
>>>>>> <geert+renesas@glider.be> wrote:
>>>>>>> Since commit a47cf07f60dcb02d ("serial: core: Call
>>>>>>> device_set_awake_path() for console port"), the serial driver properly
>>>>>>> handles the case where the serial console is part of the awake path, and
>>>>>>> it looked like we could start removing special serial console handling
>>>>>>> from PM Domain drivers like the R-Mobile SYSC PM Domain driver.
>>>>>>> Unfortunately the devil is in the details, as usual...
>>>>>>>
>>>>>>> Earlycon relies on the serial port to be initialized by the firmware
>>>>>>> and/or bootloader. Linux is not aware of any hardware dependencies that
>>>>>>> must be met to keep the port working, and thus cannot guarantee they
>>>>>>> stay met, until the full serial driver takes over.
>>>>>>>
>>>>>>> E.g. all unused clocks and unused PM Domains are disabled in a late
>>>>>>> initcall. As this happens after the full serial driver has taken over,
>>>>>>> the serial port's clock and/or PM Domain are no longer deemed unused,
>>>>>>> and this is typically not a problem.
>
> Let's call this "Case A".
>
>>>>>>>
>>>>>>> However, if the serial port's clock or PM Domain is shared with another
>>>>>>> device, and that other device is runtime-suspended before the full
>>>>>>> serial driver has probed, the serial port's clock and/or PM Domain will
>>>>>>> be disabled inadvertently. Any subsequent serial console output will
>>>>>>> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
>>>>>>> ports share their PM Domain with several other I/O devices. After the
>>>>>>> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
>>>>>>> before the full serial driver takes over, the PM Domain containing the
>>>>>>> early serial port is powered down, causing a lock-up when booted with
>>>>>>> "earlycon".
>
> Let's call this "Case B".
>
>>>>>>
>>>>>> Thanks for the detailed description of the problem! As pointed out in
>>>>>> regards to another similar recent patch [1], this is indeed a generic
>>>>>> problem, not limited to the serial console handling.
>>>>>>
>>>>>> At Linaro Connect a few weeks ago I followed up with Saravana from the
>>>>>> earlier discussions at LPC last fall. We now have a generic solution
>>>>>> for genpd drafted on plain paper, based on fw_devlink and the
>>>>>> ->sync_state() callback. I am currently working on the genpd series,
>>>>>> while Saravana will re-spin the series (can't find the link to the
>>>>>> last version) for the clock framework. Ideally, we want these things
>>>>>> to work in a very similar way.
>>>>>>
>>>>>> That said, allow me to post the series for genpd in a week or two to
>>>>>> see if it can solve your problem too, for the serial console.
>>>>>
>>>>> Both the genpd and the clock solutions will make suppliers depend on all
>>>>> their consumers to be probed, right?
>>>>>
>>>>> I think it is a solution, and should be worked on, but it has the
>>>>> drawback that suppliers that have consumers that will possibly never be
>>>>> probed, will also never be able to turn off unused resources.
>>>>>
>>>>> This was specifically the case with the TI ti-sci pmdomain case I was
>>>>> looking at: the genpd driver (ti_sci_pm_domains.c) provides a lot of
>>>>> genpds for totally unrelated devices, and so if, e.g., you don't have or
>>>>> don't want to load a driver for the GPU, all PDs are affected.
>>>>>
>>>>> Even here the solutions you mention will help: instead of things getting
>>>>> broken because genpds get turned off while they are actually in use, the
>>>>> genpds will be kept enabled, thus fixing the breakage. Unfortunately,
>>>>> they'll be kept enabled forever.
>>>>>
>>>>> I've been ill for quite a while so I haven't had the chance to look at
>>>>> this more, but before that I was hacking around a bit with something I
>>>>> named .partial_sync_state(). .sync_state() gets called when all the
>>>>> consumers have probed, but .partial_sync_state() gets called when _a_
>>>>> consumer has been probed.
>>>>>
>>>>> For the .sync_state() things are easy for the driver, as it knows
>>>>> everything related has been probed, but for .partial_sync_state() the
>>>>> driver needs to track resources internally. .partial_sync_state() will
>>>>> tell the driver that a consumer device has probed, the driver can then
>>>>> find out which specific resources (genpds in my case) that consumer
>>>>> refers to, and then... Well, that's how far I got with my hacks =).
>>>>>
>>>>> So, I don't know if this .partial_sync_state() can even work, but I
>>>>> think we do need something more on top of the .sync_state().
>>>>
>>>> Thanks for the update!
>>>>
>>>> You certainly have a point, but rather than implementing some platform
>>>> specific method, I think we should be able enforce the call to
>>>> ->sync_state(), based upon some condition/timeout - and even if all
>>>> consumers haven't been probed.
>>>
>>> Hmm, I think that was already implemented in some of the serieses out
>>> there (or even in mainline already?), as I remember doing some
>>> experiments with it. I don't like it much, though.
>>>
>>> With a simple timeout, it'll always be just a bit too early for some
>>> user (nfs mount took a bit more time than expected -> board frozen).
>>>
>>> The only condition I can see that would somewhat work is a manual
>>> trigger from the userspace. The boot scripts could then signal the
>>> kernel when all the modules have been loaded and probably a suitable,
>>> platform/use case specific amount of time has passed to allow the
>>> drivers to probe.
>>
>> This is also already supported in mainline.
>>
>> Devices with sync_state() implementations (once Ulf adds it) will have
>> a state_synced file in sysfs. It shows where it has been called yet or
>> not. But you can also echo 1 into it to force the sync_state()
>> callback (only if it hasn't been called already). So, yeah, all
>> methods of handling this are available if you implement the
>> sync_state() callback.
>>
>> By default it's all strict (wait till all consumers probe
>> successfully). But you can set it to timeout (fw_devlink.sync_state).
>> And you also have the option I mentioned above that you can use with
>> both cases.
>
> So the idea is to disable unused genpds and clocks from the genpd
> resp. clock's driver .sync_state() callback, instead of from a late
> initcall? That would indeed solve issues related to "Case A".
>
> However, how to solve "Case B"? Ignore disabling genpds or clocks
> before .sync_state() callback() has been called?
> That would cause issues for cases where the clock must be disabled,
> cfr.
> "[PATCH RFC 0/3] Add clk_disable_unprepare_sync()"
> https://lore.kernel.org/all/20240131160947.96171-1-biju.das.jz@bp.renesas.com/
> "[PATCH v3 0/3] Add clk_poll_disable_unprepare()"
> https://lore.kernel.org/linux-renesas-soc/20240318110842.41956-1-biju.das.jz@bp.renesas.com/
>
>>> It just feels a bit too much of a "let's hope this work" approach.
>>>
>>> That said, the timeout/condition is probably acceptable for many cases,
>>> where turning off a resource forcefully will just result in, say, a
>>> temporarily blanked display, or something else that gets fixed if and
>>> when the proper driver is probed.
>>>
>>> Unfortunately, here with the case I have, the whole board gets halted if
>>> the display subsystem genpd is turned off and the display driver is
>>> loaded after that.
>
> Tomi: Do you have more details? The genpd must be controlling something
> critical that must never be turned off, or perhaps the display driver
> lacks some initialization?
I don't know the exact HW level details. It may be a HW bug or possibly
a firmware issue. But what I see is simple:
If the display subsystem is powered and a video output is enabled,
turning off the PD causes the display subsystem to go to a bad state,
after which the next register access will halt the cpu.
This happens quite easily if the bootloader has enabled a display: when
the kernel's tidss driver probes, the device framework will make sure
the PD is enabled (which is fine, it's basically a no-op). But if the
tidss returns an error, like EPROBE_DEFER, the device framework will
disable the PD. When the tidss driver probes again later, it will cause
a halt at a register access.
Tomi
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling
2024-06-21 7:07 ` Geert Uytterhoeven
2024-06-21 8:49 ` Tomi Valkeinen
@ 2024-07-09 15:20 ` Ulf Hansson
1 sibling, 0 replies; 14+ messages in thread
From: Ulf Hansson @ 2024-07-09 15:20 UTC (permalink / raw)
To: Geert Uytterhoeven, Claudiu Beznea, Tomi Valkeinen
Cc: Saravana Kannan, Greg Kroah-Hartman, Jiri Slaby,
Rafael J . Wysocki, Rob Herring, Peng Fan, linux-pm, linux-serial,
linux-renesas-soc, devicetree, linux-kernel, Devarsh Thakkar,
Laurent Pinchart
[...]
> > > >>>>
> > > >>>> However, if the serial port's clock or PM Domain is shared with another
> > > >>>> device, and that other device is runtime-suspended before the full
> > > >>>> serial driver has probed, the serial port's clock and/or PM Domain will
> > > >>>> be disabled inadvertently. Any subsequent serial console output will
> > > >>>> cause a crash or system lock-up. E.g. on R/SH-Mobile SoCs, the serial
> > > >>>> ports share their PM Domain with several other I/O devices. After the
> > > >>>> use of pwm (Armadillo-800-EVA) or i2c (KZM-A9-GT) during early boot,
> > > >>>> before the full serial driver takes over, the PM Domain containing the
> > > >>>> early serial port is powered down, causing a lock-up when booted with
> > > >>>> "earlycon".
>
> Let's call this "Case B".
>
> > > >>>
> > > >>> Thanks for the detailed description of the problem! As pointed out in
> > > >>> regards to another similar recent patch [1], this is indeed a generic
> > > >>> problem, not limited to the serial console handling.
> > > >>>
> > > >>> At Linaro Connect a few weeks ago I followed up with Saravana from the
> > > >>> earlier discussions at LPC last fall. We now have a generic solution
> > > >>> for genpd drafted on plain paper, based on fw_devlink and the
> > > >>> ->sync_state() callback. I am currently working on the genpd series,
> > > >>> while Saravana will re-spin the series (can't find the link to the
> > > >>> last version) for the clock framework. Ideally, we want these things
> > > >>> to work in a very similar way.
> > > >>>
> > > >>> That said, allow me to post the series for genpd in a week or two to
> > > >>> see if it can solve your problem too, for the serial console.
I managed to hit the vacation period before I was able to post the
series. I will pick it up this week and hopefully should be able to
post something next week.
> > > >>
> > > >> Both the genpd and the clock solutions will make suppliers depend on all
> > > >> their consumers to be probed, right?
> > > >>
> > > >> I think it is a solution, and should be worked on, but it has the
> > > >> drawback that suppliers that have consumers that will possibly never be
> > > >> probed, will also never be able to turn off unused resources.
> > > >>
> > > >> This was specifically the case with the TI ti-sci pmdomain case I was
> > > >> looking at: the genpd driver (ti_sci_pm_domains.c) provides a lot of
> > > >> genpds for totally unrelated devices, and so if, e.g., you don't have or
> > > >> don't want to load a driver for the GPU, all PDs are affected.
> > > >>
> > > >> Even here the solutions you mention will help: instead of things getting
> > > >> broken because genpds get turned off while they are actually in use, the
> > > >> genpds will be kept enabled, thus fixing the breakage. Unfortunately,
> > > >> they'll be kept enabled forever.
> > > >>
> > > >> I've been ill for quite a while so I haven't had the chance to look at
> > > >> this more, but before that I was hacking around a bit with something I
> > > >> named .partial_sync_state(). .sync_state() gets called when all the
> > > >> consumers have probed, but .partial_sync_state() gets called when _a_
> > > >> consumer has been probed.
> > > >>
> > > >> For the .sync_state() things are easy for the driver, as it knows
> > > >> everything related has been probed, but for .partial_sync_state() the
> > > >> driver needs to track resources internally. .partial_sync_state() will
> > > >> tell the driver that a consumer device has probed, the driver can then
> > > >> find out which specific resources (genpds in my case) that consumer
> > > >> refers to, and then... Well, that's how far I got with my hacks =).
> > > >>
> > > >> So, I don't know if this .partial_sync_state() can even work, but I
> > > >> think we do need something more on top of the .sync_state().
> > > >
> > > > Thanks for the update!
> > > >
> > > > You certainly have a point, but rather than implementing some platform
> > > > specific method, I think we should be able enforce the call to
> > > > ->sync_state(), based upon some condition/timeout - and even if all
> > > > consumers haven't been probed.
> > >
> > > Hmm, I think that was already implemented in some of the serieses out
> > > there (or even in mainline already?), as I remember doing some
> > > experiments with it. I don't like it much, though.
> > >
> > > With a simple timeout, it'll always be just a bit too early for some
> > > user (nfs mount took a bit more time than expected -> board frozen).
> > >
> > > The only condition I can see that would somewhat work is a manual
> > > trigger from the userspace. The boot scripts could then signal the
> > > kernel when all the modules have been loaded and probably a suitable,
> > > platform/use case specific amount of time has passed to allow the
> > > drivers to probe.
> >
> > This is also already supported in mainline.
> >
> > Devices with sync_state() implementations (once Ulf adds it) will have
> > a state_synced file in sysfs. It shows where it has been called yet or
> > not. But you can also echo 1 into it to force the sync_state()
> > callback (only if it hasn't been called already). So, yeah, all
> > methods of handling this are available if you implement the
> > sync_state() callback.
> >
> > By default it's all strict (wait till all consumers probe
> > successfully). But you can set it to timeout (fw_devlink.sync_state).
> > And you also have the option I mentioned above that you can use with
> > both cases.
>
> So the idea is to disable unused genpds and clocks from the genpd
> resp. clock's driver .sync_state() callback, instead of from a late
> initcall? That would indeed solve issues related to "Case A".
>
> However, how to solve "Case B"? Ignore disabling genpds or clocks
> before .sync_state() callback() has been called?
> That would cause issues for cases where the clock must be disabled,
> cfr.
> "[PATCH RFC 0/3] Add clk_disable_unprepare_sync()"
> https://lore.kernel.org/all/20240131160947.96171-1-biju.das.jz@bp.renesas.com/
> "[PATCH v3 0/3] Add clk_poll_disable_unprepare()"
> https://lore.kernel.org/linux-renesas-soc/20240318110842.41956-1-biju.das.jz@bp.renesas.com/
>
For genpd, the plan is to check the initial state of the PM domain. It
can be powered-on or powered-off and if it's powered-on, we should not
allow it to be powered-off until after ->sync_state() have been
called.
The similar approach is what Saravanna is trying to implement for
clocks, I think.
In the end, we simply need to try out these approaches to see if they
really work. Although, based on previous discussions (LKML +
F2F-conferences), I think there should be a good chance for us.
[...]
Kind regards
Uffe
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-07-09 15:21 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-27 12:41 [PATCH/RFC 0/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
2024-05-27 12:41 ` [PATCH/RFC 1/3] earlycon: Export clock and PM Domain info from FDT Geert Uytterhoeven
2024-05-29 9:01 ` Geert Uytterhoeven
2024-05-27 12:41 ` [PATCH/RFC 2/3] pmdomain: core: Avoid earlycon power-down Geert Uytterhoeven
2024-05-27 12:41 ` [PATCHPATCH 3/3] pmdomain: renesas: rmobile-sysc: Remove serial console handling Geert Uytterhoeven
2024-06-05 9:34 ` [PATCH/RFC 0/3] " Ulf Hansson
2024-06-05 10:41 ` Tomi Valkeinen
2024-06-05 10:53 ` Ulf Hansson
2024-06-05 11:16 ` Tomi Valkeinen
2024-06-21 1:07 ` Saravana Kannan
2024-06-21 7:07 ` Geert Uytterhoeven
2024-06-21 8:49 ` Tomi Valkeinen
2024-07-09 15:20 ` Ulf Hansson
2024-06-11 4:52 ` claudiu beznea
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).