* [PATCH v1 1/9] Revert "irqchip/sifive-plic: Chain to parent IRQ after handlers are ready"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 2/9] Revert "irqchip/sifive-plic: Avoid explicit cpumask allocation on stack" Emil Renner Berthing
` (9 subsequent siblings)
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit e306a894bd511804ba9db7c00ca9cc05b55df1f2.
This is a prerequisite to reverting the patch converting the PLIC into a
platform driver. Unfortunately this breaks booting the Allwinner D1 SoC.
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 34 +++++++++++++++----------------
1 file changed, 17 insertions(+), 17 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 9e22f7e378f5..8fb183ced1e7 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -85,7 +85,7 @@ struct plic_handler {
struct plic_priv *priv;
};
static int plic_parent_irq __ro_after_init;
-static bool plic_global_setup_done __ro_after_init;
+static bool plic_cpuhp_setup_done __ro_after_init;
static DEFINE_PER_CPU(struct plic_handler, plic_handlers);
static int plic_irq_set_type(struct irq_data *d, unsigned int type);
@@ -487,8 +487,10 @@ static int plic_probe(struct platform_device *pdev)
unsigned long plic_quirks = 0;
struct plic_handler *handler;
u32 nr_irqs, parent_hwirq;
+ struct irq_domain *domain;
struct plic_priv *priv;
irq_hw_number_t hwirq;
+ bool cpuhp_setup;
if (is_of_node(dev->fwnode)) {
const struct of_device_id *id;
@@ -547,6 +549,14 @@ static int plic_probe(struct platform_device *pdev)
continue;
}
+ /* Find parent domain and register chained handler */
+ domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(), DOMAIN_BUS_ANY);
+ if (!plic_parent_irq && domain) {
+ plic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (plic_parent_irq)
+ irq_set_chained_handler(plic_parent_irq, plic_handle_irq);
+ }
+
/*
* When running in M-mode we need to ignore the S-mode handler.
* Here we assume it always comes later, but that might be a
@@ -587,35 +597,25 @@ static int plic_probe(struct platform_device *pdev)
goto fail_cleanup_contexts;
/*
- * We can have multiple PLIC instances so setup global state
+ * We can have multiple PLIC instances so setup cpuhp state
* and register syscore operations only once after context
* handlers of all online CPUs are initialized.
*/
- if (!plic_global_setup_done) {
- struct irq_domain *domain;
- bool global_setup = true;
-
+ if (!plic_cpuhp_setup_done) {
+ cpuhp_setup = true;
for_each_online_cpu(cpu) {
handler = per_cpu_ptr(&plic_handlers, cpu);
if (!handler->present) {
- global_setup = false;
+ cpuhp_setup = false;
break;
}
}
-
- if (global_setup) {
- /* Find parent domain and register chained handler */
- domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(), DOMAIN_BUS_ANY);
- if (domain)
- plic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
- if (plic_parent_irq)
- irq_set_chained_handler(plic_parent_irq, plic_handle_irq);
-
+ if (cpuhp_setup) {
cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
"irqchip/sifive/plic:starting",
plic_starting_cpu, plic_dying_cpu);
register_syscore_ops(&plic_irq_syscore_ops);
- plic_global_setup_done = true;
+ plic_cpuhp_setup_done = true;
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH v1 2/9] Revert "irqchip/sifive-plic: Avoid explicit cpumask allocation on stack"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 1/9] Revert "irqchip/sifive-plic: Chain to parent IRQ after handlers are ready" Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 3/9] Revert "irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore" Emil Renner Berthing
` (8 subsequent siblings)
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit a7fb69ffd7ce438a259b2f9fbcebc62f5caf2d4f.
This is a prerequisite to reverting the patch converting the PLIC into a
platform driver. Unfortunately this breaks booting the Allwinner D1 SoC.
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 8fb183ced1e7..f3d4cb9e34f7 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -164,12 +164,15 @@ static int plic_set_affinity(struct irq_data *d,
const struct cpumask *mask_val, bool force)
{
unsigned int cpu;
+ struct cpumask amask;
struct plic_priv *priv = irq_data_get_irq_chip_data(d);
+ cpumask_and(&amask, &priv->lmask, mask_val);
+
if (force)
- cpu = cpumask_first_and(&priv->lmask, mask_val);
+ cpu = cpumask_first(&amask);
else
- cpu = cpumask_first_and_and(&priv->lmask, mask_val, cpu_online_mask);
+ cpu = cpumask_any_and(&amask, cpu_online_mask);
if (cpu >= nr_cpu_ids)
return -EINVAL;
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH v1 3/9] Revert "irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 1/9] Revert "irqchip/sifive-plic: Chain to parent IRQ after handlers are ready" Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 2/9] Revert "irqchip/sifive-plic: Avoid explicit cpumask allocation on stack" Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 4/9] Revert "irqchip/sifive-plic: Parse number of interrupts and contexts early in plic_probe()" Emil Renner Berthing
` (7 subsequent siblings)
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit abb7205794900503d6358ef1fb645373753a794d.
This is a prerequisite to reverting the patch converting the PLIC into a
platform driver. Unfortunately this breaks booting the Allwinner D1 SoC.
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index f3d4cb9e34f7..cbccd1da3ea1 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -103,11 +103,9 @@ static void __plic_toggle(void __iomem *enable_base, int hwirq, int enable)
static void plic_toggle(struct plic_handler *handler, int hwirq, int enable)
{
- unsigned long flags;
-
- raw_spin_lock_irqsave(&handler->enable_lock, flags);
+ raw_spin_lock(&handler->enable_lock);
__plic_toggle(handler->enable_base, hwirq, enable);
- raw_spin_unlock_irqrestore(&handler->enable_lock, flags);
+ raw_spin_unlock(&handler->enable_lock);
}
static inline void plic_irq_toggle(const struct cpumask *mask,
@@ -244,7 +242,6 @@ static int plic_irq_set_type(struct irq_data *d, unsigned int type)
static int plic_irq_suspend(void)
{
unsigned int i, cpu;
- unsigned long flags;
u32 __iomem *reg;
struct plic_priv *priv;
@@ -262,12 +259,12 @@ static int plic_irq_suspend(void)
if (!handler->present)
continue;
- raw_spin_lock_irqsave(&handler->enable_lock, flags);
+ raw_spin_lock(&handler->enable_lock);
for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {
reg = handler->enable_base + i * sizeof(u32);
handler->enable_save[i] = readl(reg);
}
- raw_spin_unlock_irqrestore(&handler->enable_lock, flags);
+ raw_spin_unlock(&handler->enable_lock);
}
return 0;
@@ -276,7 +273,6 @@ static int plic_irq_suspend(void)
static void plic_irq_resume(void)
{
unsigned int i, index, cpu;
- unsigned long flags;
u32 __iomem *reg;
struct plic_priv *priv;
@@ -294,12 +290,12 @@ static void plic_irq_resume(void)
if (!handler->present)
continue;
- raw_spin_lock_irqsave(&handler->enable_lock, flags);
+ raw_spin_lock(&handler->enable_lock);
for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {
reg = handler->enable_base + i * sizeof(u32);
writel(handler->enable_save[i], reg);
}
- raw_spin_unlock_irqrestore(&handler->enable_lock, flags);
+ raw_spin_unlock(&handler->enable_lock);
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH v1 4/9] Revert "irqchip/sifive-plic: Parse number of interrupts and contexts early in plic_probe()"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
` (2 preceding siblings ...)
2024-08-14 14:56 ` [PATCH v1 3/9] Revert "irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore" Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 5/9] Revert "irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain creation failure" Emil Renner Berthing
` (6 subsequent siblings)
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit 95652106478030f54620b1f0d28f78ab110b3212.
This is a prerequisite to reverting the patch converting the PLIC into a
platform driver. Unfortunately this breaks booting the Allwinner D1 SoC.
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 43 +++++++------------------------
1 file changed, 10 insertions(+), 33 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index cbccd1da3ea1..b4c4050a02fb 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -423,34 +423,6 @@ static const struct of_device_id plic_match[] = {
{}
};
-static int plic_parse_nr_irqs_and_contexts(struct platform_device *pdev,
- u32 *nr_irqs, u32 *nr_contexts)
-{
- struct device *dev = &pdev->dev;
- int rc;
-
- /*
- * Currently, only OF fwnode is supported so extend this
- * function for ACPI support.
- */
- if (!is_of_node(dev->fwnode))
- return -EINVAL;
-
- rc = of_property_read_u32(to_of_node(dev->fwnode), "riscv,ndev", nr_irqs);
- if (rc) {
- dev_err(dev, "riscv,ndev property not available\n");
- return rc;
- }
-
- *nr_contexts = of_irq_count(to_of_node(dev->fwnode));
- if (WARN_ON(!(*nr_contexts))) {
- dev_err(dev, "no PLIC context available\n");
- return -EINVAL;
- }
-
- return 0;
-}
-
static int plic_parse_context_parent(struct platform_device *pdev, u32 context,
u32 *parent_hwirq, int *parent_cpu)
{
@@ -499,26 +471,31 @@ static int plic_probe(struct platform_device *pdev)
plic_quirks = (unsigned long)id->data;
}
- error = plic_parse_nr_irqs_and_contexts(pdev, &nr_irqs, &nr_contexts);
- if (error)
- return error;
-
priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
priv->dev = dev;
priv->plic_quirks = plic_quirks;
- priv->nr_irqs = nr_irqs;
priv->regs = devm_platform_ioremap_resource(pdev, 0);
if (WARN_ON(!priv->regs))
return -EIO;
+ of_property_read_u32(to_of_node(dev->fwnode), "riscv,ndev", &nr_irqs);
+ if (WARN_ON(!nr_irqs))
+ return -EINVAL;
+
+ priv->nr_irqs = nr_irqs;
+
priv->prio_save = devm_bitmap_zalloc(dev, nr_irqs, GFP_KERNEL);
if (!priv->prio_save)
return -ENOMEM;
+ nr_contexts = of_irq_count(to_of_node(dev->fwnode));
+ if (WARN_ON(!nr_contexts))
+ return -EINVAL;
+
for (i = 0; i < nr_contexts; i++) {
error = plic_parse_context_parent(pdev, i, &parent_hwirq, &cpu);
if (error) {
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH v1 5/9] Revert "irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain creation failure"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
` (3 preceding siblings ...)
2024-08-14 14:56 ` [PATCH v1 4/9] Revert "irqchip/sifive-plic: Parse number of interrupts and contexts early in plic_probe()" Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 6/9] Revert "irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent fwnode" Emil Renner Berthing
` (5 subsequent siblings)
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit a15587277a246c388c83b1cd9cf7c1a868cd752f.
This is a prerequisite to reverting the patch converting the PLIC into a
platform driver. Unfortunately this breaks booting the Allwinner D1 SoC.
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 73 +++++++++----------------------
1 file changed, 20 insertions(+), 53 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index b4c4050a02fb..85e94b8f4c06 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -423,45 +423,17 @@ static const struct of_device_id plic_match[] = {
{}
};
-static int plic_parse_context_parent(struct platform_device *pdev, u32 context,
- u32 *parent_hwirq, int *parent_cpu)
-{
- struct device *dev = &pdev->dev;
- struct of_phandle_args parent;
- unsigned long hartid;
- int rc;
-
- /*
- * Currently, only OF fwnode is supported so extend this
- * function for ACPI support.
- */
- if (!is_of_node(dev->fwnode))
- return -EINVAL;
-
- rc = of_irq_parse_one(to_of_node(dev->fwnode), context, &parent);
- if (rc)
- return rc;
-
- rc = riscv_of_parent_hartid(parent.np, &hartid);
- if (rc)
- return rc;
-
- *parent_hwirq = parent.args[0];
- *parent_cpu = riscv_hartid_to_cpuid(hartid);
- return 0;
-}
-
static int plic_probe(struct platform_device *pdev)
{
- int error = 0, nr_contexts, nr_handlers = 0, cpu, i;
+ int error = 0, nr_contexts, nr_handlers = 0, i;
struct device *dev = &pdev->dev;
unsigned long plic_quirks = 0;
struct plic_handler *handler;
- u32 nr_irqs, parent_hwirq;
struct irq_domain *domain;
struct plic_priv *priv;
- irq_hw_number_t hwirq;
bool cpuhp_setup;
+ unsigned int cpu;
+ u32 nr_irqs;
if (is_of_node(dev->fwnode)) {
const struct of_device_id *id;
@@ -497,9 +469,13 @@ static int plic_probe(struct platform_device *pdev)
return -EINVAL;
for (i = 0; i < nr_contexts; i++) {
- error = plic_parse_context_parent(pdev, i, &parent_hwirq, &cpu);
- if (error) {
- dev_warn(dev, "hwirq for context%d not found\n", i);
+ struct of_phandle_args parent;
+ irq_hw_number_t hwirq;
+ int cpu;
+ unsigned long hartid;
+
+ if (of_irq_parse_one(to_of_node(dev->fwnode), i, &parent)) {
+ dev_err(dev, "failed to parse parent for context %d.\n", i);
continue;
}
@@ -507,7 +483,7 @@ static int plic_probe(struct platform_device *pdev)
* Skip contexts other than external interrupts for our
* privilege level.
*/
- if (parent_hwirq != RV_IRQ_EXT) {
+ if (parent.args[0] != RV_IRQ_EXT) {
/* Disable S-mode enable bits if running in M-mode. */
if (IS_ENABLED(CONFIG_RISCV_M_MODE)) {
void __iomem *enable_base = priv->regs +
@@ -520,6 +496,13 @@ static int plic_probe(struct platform_device *pdev)
continue;
}
+ error = riscv_of_parent_hartid(parent.np, &hartid);
+ if (error < 0) {
+ dev_warn(dev, "failed to parse hart ID for context %d.\n", i);
+ continue;
+ }
+
+ cpu = riscv_hartid_to_cpuid(hartid);
if (cpu < 0) {
dev_warn(dev, "Invalid cpuid for context %d\n", i);
continue;
@@ -557,7 +540,7 @@ static int plic_probe(struct platform_device *pdev)
handler->enable_save = devm_kcalloc(dev, DIV_ROUND_UP(nr_irqs, 32),
sizeof(*handler->enable_save), GFP_KERNEL);
if (!handler->enable_save)
- goto fail_cleanup_contexts;
+ return -ENOMEM;
done:
for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
plic_toggle(handler, hwirq, 0);
@@ -570,7 +553,7 @@ static int plic_probe(struct platform_device *pdev)
priv->irqdomain = irq_domain_add_linear(to_of_node(dev->fwnode), nr_irqs + 1,
&plic_irqdomain_ops, priv);
if (WARN_ON(!priv->irqdomain))
- goto fail_cleanup_contexts;
+ return -ENOMEM;
/*
* We can have multiple PLIC instances so setup cpuhp state
@@ -598,22 +581,6 @@ static int plic_probe(struct platform_device *pdev)
dev_info(dev, "mapped %d interrupts with %d handlers for %d contexts.\n",
nr_irqs, nr_handlers, nr_contexts);
return 0;
-
-fail_cleanup_contexts:
- for (i = 0; i < nr_contexts; i++) {
- if (plic_parse_context_parent(pdev, i, &parent_hwirq, &cpu))
- continue;
- if (parent_hwirq != RV_IRQ_EXT || cpu < 0)
- continue;
-
- handler = per_cpu_ptr(&plic_handlers, cpu);
- handler->present = false;
- handler->hart_base = NULL;
- handler->enable_base = NULL;
- handler->enable_save = NULL;
- handler->priv = NULL;
- }
- return -ENOMEM;
}
static struct platform_driver plic_driver = {
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH v1 6/9] Revert "irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent fwnode"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
` (4 preceding siblings ...)
2024-08-14 14:56 ` [PATCH v1 5/9] Revert "irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain creation failure" Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 7/9] Revert "irqchip/sifive-plic: Use devm_xyz() for managed allocation" Emil Renner Berthing
` (4 subsequent siblings)
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit 6c725f33d67b53f2d302c2c4509deae953fc6ade.
This is a prerequisite to reverting the patch converting the PLIC into a
platform driver. Unfortunately this breaks booting the Allwinner D1 SoC.
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 85e94b8f4c06..7dbc662a229c 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -429,7 +429,6 @@ static int plic_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
unsigned long plic_quirks = 0;
struct plic_handler *handler;
- struct irq_domain *domain;
struct plic_priv *priv;
bool cpuhp_setup;
unsigned int cpu;
@@ -509,11 +508,11 @@ static int plic_probe(struct platform_device *pdev)
}
/* Find parent domain and register chained handler */
- domain = irq_find_matching_fwnode(riscv_get_intc_hwnode(), DOMAIN_BUS_ANY);
- if (!plic_parent_irq && domain) {
- plic_parent_irq = irq_create_mapping(domain, RV_IRQ_EXT);
+ if (!plic_parent_irq && irq_find_host(parent.np)) {
+ plic_parent_irq = irq_of_parse_and_map(to_of_node(dev->fwnode), i);
if (plic_parent_irq)
- irq_set_chained_handler(plic_parent_irq, plic_handle_irq);
+ irq_set_chained_handler(plic_parent_irq,
+ plic_handle_irq);
}
/*
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH v1 7/9] Revert "irqchip/sifive-plic: Use devm_xyz() for managed allocation"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
` (5 preceding siblings ...)
2024-08-14 14:56 ` [PATCH v1 6/9] Revert "irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent fwnode" Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 8/9] Revert "irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz()" Emil Renner Berthing
` (3 subsequent siblings)
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit b68d0ff529a939a118ec52f271be8cad5d99e79a.
This is a prerequisite to reverting the patch converting the PLIC into a
platform driver. Unfortunately this breaks booting the Allwinner D1 SoC.
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 49 +++++++++++++++++++++----------
1 file changed, 33 insertions(+), 16 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 7dbc662a229c..7cf06bbb3098 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -442,30 +442,39 @@ static int plic_probe(struct platform_device *pdev)
plic_quirks = (unsigned long)id->data;
}
- priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
priv->dev = dev;
priv->plic_quirks = plic_quirks;
- priv->regs = devm_platform_ioremap_resource(pdev, 0);
- if (WARN_ON(!priv->regs))
- return -EIO;
+ priv->regs = of_iomap(to_of_node(dev->fwnode), 0);
+ if (WARN_ON(!priv->regs)) {
+ error = -EIO;
+ goto out_free_priv;
+ }
+ error = -EINVAL;
of_property_read_u32(to_of_node(dev->fwnode), "riscv,ndev", &nr_irqs);
if (WARN_ON(!nr_irqs))
- return -EINVAL;
+ goto out_iounmap;
priv->nr_irqs = nr_irqs;
- priv->prio_save = devm_bitmap_zalloc(dev, nr_irqs, GFP_KERNEL);
+ priv->prio_save = bitmap_alloc(nr_irqs, GFP_KERNEL);
if (!priv->prio_save)
- return -ENOMEM;
+ goto out_free_priority_reg;
nr_contexts = of_irq_count(to_of_node(dev->fwnode));
if (WARN_ON(!nr_contexts))
- return -EINVAL;
+ goto out_free_priority_reg;
+
+ error = -ENOMEM;
+ priv->irqdomain = irq_domain_add_linear(to_of_node(dev->fwnode), nr_irqs + 1,
+ &plic_irqdomain_ops, priv);
+ if (WARN_ON(!priv->irqdomain))
+ goto out_free_priority_reg;
for (i = 0; i < nr_contexts; i++) {
struct of_phandle_args parent;
@@ -536,10 +545,10 @@ static int plic_probe(struct platform_device *pdev)
i * CONTEXT_ENABLE_SIZE;
handler->priv = priv;
- handler->enable_save = devm_kcalloc(dev, DIV_ROUND_UP(nr_irqs, 32),
- sizeof(*handler->enable_save), GFP_KERNEL);
+ handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32),
+ sizeof(*handler->enable_save), GFP_KERNEL);
if (!handler->enable_save)
- return -ENOMEM;
+ goto out_free_enable_reg;
done:
for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
plic_toggle(handler, hwirq, 0);
@@ -549,11 +558,6 @@ static int plic_probe(struct platform_device *pdev)
nr_handlers++;
}
- priv->irqdomain = irq_domain_add_linear(to_of_node(dev->fwnode), nr_irqs + 1,
- &plic_irqdomain_ops, priv);
- if (WARN_ON(!priv->irqdomain))
- return -ENOMEM;
-
/*
* We can have multiple PLIC instances so setup cpuhp state
* and register syscore operations only once after context
@@ -580,6 +584,19 @@ static int plic_probe(struct platform_device *pdev)
dev_info(dev, "mapped %d interrupts with %d handlers for %d contexts.\n",
nr_irqs, nr_handlers, nr_contexts);
return 0;
+
+out_free_enable_reg:
+ for_each_cpu(cpu, cpu_present_mask) {
+ handler = per_cpu_ptr(&plic_handlers, cpu);
+ kfree(handler->enable_save);
+ }
+out_free_priority_reg:
+ kfree(priv->prio_save);
+out_iounmap:
+ iounmap(priv->regs);
+out_free_priv:
+ kfree(priv);
+ return error;
}
static struct platform_driver plic_driver = {
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH v1 8/9] Revert "irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz()"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
` (6 preceding siblings ...)
2024-08-14 14:56 ` [PATCH v1 7/9] Revert "irqchip/sifive-plic: Use devm_xyz() for managed allocation" Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 14:56 ` [PATCH v1 9/9] Revert "irqchip/sifive-plic: Convert PLIC driver into a platform driver" Emil Renner Berthing
` (2 subsequent siblings)
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit 25d862e183d4efeb5e8b9843d783c90aaae4b14a.
This is a prerequisite to reverting the patch converting the PLIC into a
platform driver. Unfortunately this breaks booting the Allwinner D1 SoC.
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 7cf06bbb3098..ac274e1166c3 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -3,6 +3,7 @@
* Copyright (C) 2017 SiFive
* Copyright (C) 2018 Christoph Hellwig
*/
+#define pr_fmt(fmt) "plic: " fmt
#include <linux/cpu.h>
#include <linux/interrupt.h>
#include <linux/io.h>
@@ -376,10 +377,9 @@ static void plic_handle_irq(struct irq_desc *desc)
while ((hwirq = readl(claim))) {
int err = generic_handle_domain_irq(handler->priv->irqdomain,
hwirq);
- if (unlikely(err)) {
- dev_warn_ratelimited(handler->priv->dev,
- "can't find mapping for hwirq %lu\n", hwirq);
- }
+ if (unlikely(err))
+ pr_warn_ratelimited("can't find mapping for hwirq %lu\n",
+ hwirq);
}
chained_irq_exit(chip, desc);
@@ -407,7 +407,7 @@ static int plic_starting_cpu(unsigned int cpu)
enable_percpu_irq(plic_parent_irq,
irq_get_trigger_type(plic_parent_irq));
else
- dev_warn(handler->priv->dev, "cpu%d: parent irq not available\n", cpu);
+ pr_warn("cpu%d: parent irq not available\n", cpu);
plic_set_threshold(handler, PLIC_ENABLE_THRESHOLD);
return 0;
@@ -483,7 +483,7 @@ static int plic_probe(struct platform_device *pdev)
unsigned long hartid;
if (of_irq_parse_one(to_of_node(dev->fwnode), i, &parent)) {
- dev_err(dev, "failed to parse parent for context %d.\n", i);
+ pr_err("failed to parse parent for context %d.\n", i);
continue;
}
@@ -506,13 +506,13 @@ static int plic_probe(struct platform_device *pdev)
error = riscv_of_parent_hartid(parent.np, &hartid);
if (error < 0) {
- dev_warn(dev, "failed to parse hart ID for context %d.\n", i);
+ pr_warn("failed to parse hart ID for context %d.\n", i);
continue;
}
cpu = riscv_hartid_to_cpuid(hartid);
if (cpu < 0) {
- dev_warn(dev, "Invalid cpuid for context %d\n", i);
+ pr_warn("Invalid cpuid for context %d\n", i);
continue;
}
@@ -531,7 +531,7 @@ static int plic_probe(struct platform_device *pdev)
*/
handler = per_cpu_ptr(&plic_handlers, cpu);
if (handler->present) {
- dev_warn(dev, "handler already present for context %d.\n", i);
+ pr_warn("handler already present for context %d.\n", i);
plic_set_threshold(handler, PLIC_DISABLE_THRESHOLD);
goto done;
}
@@ -581,8 +581,8 @@ static int plic_probe(struct platform_device *pdev)
}
}
- dev_info(dev, "mapped %d interrupts with %d handlers for %d contexts.\n",
- nr_irqs, nr_handlers, nr_contexts);
+ pr_info("%pOFP: mapped %d interrupts with %d handlers for %d contexts.\n",
+ to_of_node(dev->fwnode), nr_irqs, nr_handlers, nr_contexts);
return 0;
out_free_enable_reg:
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* [PATCH v1 9/9] Revert "irqchip/sifive-plic: Convert PLIC driver into a platform driver"
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
` (7 preceding siblings ...)
2024-08-14 14:56 ` [PATCH v1 8/9] Revert "irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz()" Emil Renner Berthing
@ 2024-08-14 14:56 ` Emil Renner Berthing
2024-08-14 17:30 ` [PATCH v1 0/9] Fix Allwinner D1 boot regression Thomas Gleixner
2024-08-18 14:47 ` Palmer Dabbelt
10 siblings, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-14 14:56 UTC (permalink / raw)
To: linux-kernel, linux-riscv, Anup Patel
Cc: Thomas Gleixner, Paul Walmsley, Samuel Holland, Palmer Dabbelt,
Albert Ou
This reverts commit 8ec99b033147ef3bb8f0a560c24eb1baec3bc0be.
This change makes the Allwinner D1 SoC lock up at boot as described in
the thread below.
Link: https://lore.kernel.org/linux-riscv/CAJM55Z9hGKo4784N3s3DhWw=nMRKZKcmvZ58x7uVBghExhoc9A@mail.gmail.com/
Fixes: 8ec99b033147 ("irqchip/sifive-plic: Convert PLIC driver into a platform driver")
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
---
drivers/irqchip/irq-sifive-plic.c | 103 ++++++++++++------------------
1 file changed, 41 insertions(+), 62 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index ac274e1166c3..bf0b40b0fad4 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -64,7 +64,6 @@
#define PLIC_QUIRK_EDGE_INTERRUPT 0
struct plic_priv {
- struct device *dev;
struct cpumask lmask;
struct irq_domain *irqdomain;
void __iomem *regs;
@@ -413,50 +412,30 @@ static int plic_starting_cpu(unsigned int cpu)
return 0;
}
-static const struct of_device_id plic_match[] = {
- { .compatible = "sifive,plic-1.0.0" },
- { .compatible = "riscv,plic0" },
- { .compatible = "andestech,nceplic100",
- .data = (const void *)BIT(PLIC_QUIRK_EDGE_INTERRUPT) },
- { .compatible = "thead,c900-plic",
- .data = (const void *)BIT(PLIC_QUIRK_EDGE_INTERRUPT) },
- {}
-};
-
-static int plic_probe(struct platform_device *pdev)
+static int __init __plic_init(struct device_node *node,
+ struct device_node *parent,
+ unsigned long plic_quirks)
{
int error = 0, nr_contexts, nr_handlers = 0, i;
- struct device *dev = &pdev->dev;
- unsigned long plic_quirks = 0;
- struct plic_handler *handler;
- struct plic_priv *priv;
- bool cpuhp_setup;
- unsigned int cpu;
u32 nr_irqs;
-
- if (is_of_node(dev->fwnode)) {
- const struct of_device_id *id;
-
- id = of_match_node(plic_match, to_of_node(dev->fwnode));
- if (id)
- plic_quirks = (unsigned long)id->data;
- }
+ struct plic_priv *priv;
+ struct plic_handler *handler;
+ unsigned int cpu;
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
- priv->dev = dev;
priv->plic_quirks = plic_quirks;
- priv->regs = of_iomap(to_of_node(dev->fwnode), 0);
+ priv->regs = of_iomap(node, 0);
if (WARN_ON(!priv->regs)) {
error = -EIO;
goto out_free_priv;
}
error = -EINVAL;
- of_property_read_u32(to_of_node(dev->fwnode), "riscv,ndev", &nr_irqs);
+ of_property_read_u32(node, "riscv,ndev", &nr_irqs);
if (WARN_ON(!nr_irqs))
goto out_iounmap;
@@ -466,13 +445,13 @@ static int plic_probe(struct platform_device *pdev)
if (!priv->prio_save)
goto out_free_priority_reg;
- nr_contexts = of_irq_count(to_of_node(dev->fwnode));
+ nr_contexts = of_irq_count(node);
if (WARN_ON(!nr_contexts))
goto out_free_priority_reg;
error = -ENOMEM;
- priv->irqdomain = irq_domain_add_linear(to_of_node(dev->fwnode), nr_irqs + 1,
- &plic_irqdomain_ops, priv);
+ priv->irqdomain = irq_domain_add_linear(node, nr_irqs + 1,
+ &plic_irqdomain_ops, priv);
if (WARN_ON(!priv->irqdomain))
goto out_free_priority_reg;
@@ -482,7 +461,7 @@ static int plic_probe(struct platform_device *pdev)
int cpu;
unsigned long hartid;
- if (of_irq_parse_one(to_of_node(dev->fwnode), i, &parent)) {
+ if (of_irq_parse_one(node, i, &parent)) {
pr_err("failed to parse parent for context %d.\n", i);
continue;
}
@@ -518,7 +497,7 @@ static int plic_probe(struct platform_device *pdev)
/* Find parent domain and register chained handler */
if (!plic_parent_irq && irq_find_host(parent.np)) {
- plic_parent_irq = irq_of_parse_and_map(to_of_node(dev->fwnode), i);
+ plic_parent_irq = irq_of_parse_and_map(node, i);
if (plic_parent_irq)
irq_set_chained_handler(plic_parent_irq,
plic_handle_irq);
@@ -560,29 +539,20 @@ static int plic_probe(struct platform_device *pdev)
/*
* We can have multiple PLIC instances so setup cpuhp state
- * and register syscore operations only once after context
- * handlers of all online CPUs are initialized.
+ * and register syscore operations only when context handler
+ * for current/boot CPU is present.
*/
- if (!plic_cpuhp_setup_done) {
- cpuhp_setup = true;
- for_each_online_cpu(cpu) {
- handler = per_cpu_ptr(&plic_handlers, cpu);
- if (!handler->present) {
- cpuhp_setup = false;
- break;
- }
- }
- if (cpuhp_setup) {
- cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
- "irqchip/sifive/plic:starting",
- plic_starting_cpu, plic_dying_cpu);
- register_syscore_ops(&plic_irq_syscore_ops);
- plic_cpuhp_setup_done = true;
- }
+ handler = this_cpu_ptr(&plic_handlers);
+ if (handler->present && !plic_cpuhp_setup_done) {
+ cpuhp_setup_state(CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
+ "irqchip/sifive/plic:starting",
+ plic_starting_cpu, plic_dying_cpu);
+ register_syscore_ops(&plic_irq_syscore_ops);
+ plic_cpuhp_setup_done = true;
}
- pr_info("%pOFP: mapped %d interrupts with %d handlers for %d contexts.\n",
- to_of_node(dev->fwnode), nr_irqs, nr_handlers, nr_contexts);
+ pr_info("%pOFP: mapped %d interrupts with %d handlers for"
+ " %d contexts.\n", node, nr_irqs, nr_handlers, nr_contexts);
return 0;
out_free_enable_reg:
@@ -599,11 +569,20 @@ static int plic_probe(struct platform_device *pdev)
return error;
}
-static struct platform_driver plic_driver = {
- .driver = {
- .name = "riscv-plic",
- .of_match_table = plic_match,
- },
- .probe = plic_probe,
-};
-builtin_platform_driver(plic_driver);
+static int __init plic_init(struct device_node *node,
+ struct device_node *parent)
+{
+ return __plic_init(node, parent, 0);
+}
+
+IRQCHIP_DECLARE(sifive_plic, "sifive,plic-1.0.0", plic_init);
+IRQCHIP_DECLARE(riscv_plic0, "riscv,plic0", plic_init); /* for legacy systems */
+
+static int __init plic_edge_init(struct device_node *node,
+ struct device_node *parent)
+{
+ return __plic_init(node, parent, BIT(PLIC_QUIRK_EDGE_INTERRUPT));
+}
+
+IRQCHIP_DECLARE(andestech_nceplic100, "andestech,nceplic100", plic_edge_init);
+IRQCHIP_DECLARE(thead_c900_plic, "thead,c900-plic", plic_edge_init);
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
` (8 preceding siblings ...)
2024-08-14 14:56 ` [PATCH v1 9/9] Revert "irqchip/sifive-plic: Convert PLIC driver into a platform driver" Emil Renner Berthing
@ 2024-08-14 17:30 ` Thomas Gleixner
2024-08-15 10:29 ` Emil Renner Berthing
2024-08-15 17:51 ` Palmer Dabbelt
2024-08-18 14:47 ` Palmer Dabbelt
10 siblings, 2 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-08-14 17:30 UTC (permalink / raw)
To: Emil Renner Berthing, linux-kernel, linux-riscv, Anup Patel
Cc: Paul Walmsley, Samuel Holland, Palmer Dabbelt, Albert Ou,
Daniel Lezcano
On Wed, Aug 14 2024 at 16:56, Emil Renner Berthing wrote:
> As described in the thread below[1] I haven't been able to boot my
> boards based on the Allwinner D1 SoC since 6.9 where you converted the
> SiFive PLIC driver to a platform driver.
>
> This is clearly a regression and there haven't really been much progress
> on fixing the issue since then, so here is the revert that fixes it.
>
> If no other fix is found before 6.11 I suggest we apply this.
So this mess has been ignored for two month now?
From the pastebin in the initial report:
[ 0.000000] irq: no irq domain found for interrupt-controller@10000000 !
[ 0.000000] Failed to map interrupt for /soc/timer@2050000
[ 0.000000] Failed to initialize '/soc/timer@2050000': -22
This comes back with -EINVAL. So the timer cannot find an interrupt,
which makes it pretty obvious why the system stops to boot, unless there
is some other timer available.
This is obviously related to the SUN4I_TIMER because that message went
away when it was disabled according to the next pastebin.
Obviously that can't work because the SUN4I timer driver is using
timer_of_init() which cannot handle deferred probing.
Daniel: There was a partial fix for the sun4i driver, which you said you
applied:
https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com
But that thing never materialized in a pull request.
And of course everyone involved ignored the problem since March 13th
2024, i.e. almost half a year.
Seriously?
Can you RISCV folks get your act together and ensure to fix things you
broke on the way? Especially when Emil reported this nobody pointed him
to this patch and nobody noticed that it's still not merged?
It took me less than 15 minutes to find that patch and the correlation,
but this is absolutely not my job.
I'm seriously grumpy about that. This is not how it works. If you break
stuff, then you take care to fix it before you shove more changes into
the tree and waste my time.
I'm very much inclined to take the reverts right now, send them to Linus
for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv
patches until the next merge window is over.
Emil, can you give that sun4i fix a test ride please?
Thanks,
tglx
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-14 17:30 ` [PATCH v1 0/9] Fix Allwinner D1 boot regression Thomas Gleixner
@ 2024-08-15 10:29 ` Emil Renner Berthing
2024-08-15 11:44 ` Thomas Gleixner
2024-08-15 17:51 ` Palmer Dabbelt
1 sibling, 1 reply; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-15 10:29 UTC (permalink / raw)
To: Thomas Gleixner, Emil Renner Berthing, linux-kernel, linux-riscv,
Anup Patel
Cc: Paul Walmsley, Samuel Holland, Palmer Dabbelt, Albert Ou,
Daniel Lezcano
Thomas Gleixner wrote:
> On Wed, Aug 14 2024 at 16:56, Emil Renner Berthing wrote:
> > As described in the thread below[1] I haven't been able to boot my
> > boards based on the Allwinner D1 SoC since 6.9 where you converted the
> > SiFive PLIC driver to a platform driver.
> >
> > This is clearly a regression and there haven't really been much progress
> > on fixing the issue since then, so here is the revert that fixes it.
> >
> > If no other fix is found before 6.11 I suggest we apply this.
>
> So this mess has been ignored for two month now?
>
> From the pastebin in the initial report:
>
> [ 0.000000] irq: no irq domain found for interrupt-controller@10000000 !
> [ 0.000000] Failed to map interrupt for /soc/timer@2050000
> [ 0.000000] Failed to initialize '/soc/timer@2050000': -22
>
> This comes back with -EINVAL. So the timer cannot find an interrupt,
> which makes it pretty obvious why the system stops to boot, unless there
> is some other timer available.
>
> This is obviously related to the SUN4I_TIMER because that message went
> away when it was disabled according to the next pastebin.
>
> Obviously that can't work because the SUN4I timer driver is using
> timer_of_init() which cannot handle deferred probing.
>
> Daniel: There was a partial fix for the sun4i driver, which you said you
> applied:
>
> https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com
>
> But that thing never materialized in a pull request.
>
> And of course everyone involved ignored the problem since March 13th
> 2024, i.e. almost half a year.
>
> Seriously?
>
> Can you RISCV folks get your act together and ensure to fix things you
> broke on the way? Especially when Emil reported this nobody pointed him
> to this patch and nobody noticed that it's still not merged?
>
> It took me less than 15 minutes to find that patch and the correlation,
> but this is absolutely not my job.
>
> I'm seriously grumpy about that. This is not how it works. If you break
> stuff, then you take care to fix it before you shove more changes into
> the tree and waste my time.
>
> I'm very much inclined to take the reverts right now, send them to Linus
> for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv
> patches until the next merge window is over.
>
> Emil, can you give that sun4i fix a test ride please?
Hi Thomas,
Thanks for looking at this! Unfortunately the above patch isn't enough to fix
the issue:
https://termbin.com/7sgc
It still hangs after the "[ 0.176451] cpuidle: using governor teo" message
until the watchdog reboots the systems.
/Emil
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 10:29 ` Emil Renner Berthing
@ 2024-08-15 11:44 ` Thomas Gleixner
2024-08-15 12:04 ` Emil Renner Berthing
0 siblings, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2024-08-15 11:44 UTC (permalink / raw)
To: Emil Renner Berthing, Emil Renner Berthing, linux-kernel,
linux-riscv, Anup Patel
Cc: Paul Walmsley, Samuel Holland, Palmer Dabbelt, Albert Ou,
Daniel Lezcano
On Thu, Aug 15 2024 at 03:29, Emil Renner Berthing wrote:
> Thomas Gleixner wrote:
> Thanks for looking at this! Unfortunately the above patch isn't enough to fix
> the issue:
>
> https://termbin.com/7sgc
>
> It still hangs after the "[ 0.176451] cpuidle: using governor teo" message
> until the watchdog reboots the systems.
So what's puzzling is that there is a timer installed early on:
[ 0.000000] clocksource: riscv_clocksource: ....
That same init function installs the per cpu riscv clockevent, so there
should be a timer available.
The deffered probing of the PLIC driver delays obviously the probing of
the sun4i timer, but that should not matter when another timer is
available. So the sun4i driver might be a red herring.
Can you please add "ignore_loglevel initcall_debug" to the command line
and provide the output of a booting and a failing kernel?
And on the booting kernel please provide the output from:
# cat /sys/devices/system/clockevents/clockevent0/current_device
# cat /sys/devices/system/clockevents/broadcast/current_device
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
Thanks,
tglx
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 11:44 ` Thomas Gleixner
@ 2024-08-15 12:04 ` Emil Renner Berthing
2024-08-15 12:14 ` Emil Renner Berthing
0 siblings, 1 reply; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-15 12:04 UTC (permalink / raw)
To: Thomas Gleixner, Emil Renner Berthing, linux-kernel, linux-riscv,
Anup Patel
Cc: Paul Walmsley, Samuel Holland, Palmer Dabbelt, Albert Ou,
Daniel Lezcano
Thomas Gleixner wrote:
> On Thu, Aug 15 2024 at 03:29, Emil Renner Berthing wrote:
> > Thomas Gleixner wrote:
> > Thanks for looking at this! Unfortunately the above patch isn't enough to fix
> > the issue:
> >
> > https://termbin.com/7sgc
> >
> > It still hangs after the "[ 0.176451] cpuidle: using governor teo" message
> > until the watchdog reboots the systems.
>
> So what's puzzling is that there is a timer installed early on:
>
> [ 0.000000] clocksource: riscv_clocksource: ....
>
> That same init function installs the per cpu riscv clockevent, so there
> should be a timer available.
>
> The deffered probing of the PLIC driver delays obviously the probing of
> the sun4i timer, but that should not matter when another timer is
> available. So the sun4i driver might be a red herring.
>
> Can you please add "ignore_loglevel initcall_debug" to the command line
> and provide the output of a booting and a failing kernel?
6.11-rc3 + these reverts: https://termbin.com/q6wk
6.11-rc3 + Samuel's patch: https://termbin.com/7cgs
> And on the booting kernel please provide the output from:
>
> # cat /sys/devices/system/clockevents/clockevent0/current_device
> # cat /sys/devices/system/clockevents/broadcast/current_device
> # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
On both a 6.8.6 kernel and 6.11-rc3 + reverts I get:
# cat /sys/devices/system/clockevents/clockevent0/current_device
sun4i_tick
# cat /sys/devices/system/clockevents/broadcast/current_device
riscv_timer_clockevent
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
riscv_clocksource
/Emil
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 12:04 ` Emil Renner Berthing
@ 2024-08-15 12:14 ` Emil Renner Berthing
2024-08-15 13:16 ` Thomas Gleixner
0 siblings, 1 reply; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-15 12:14 UTC (permalink / raw)
To: Emil Renner Berthing, Thomas Gleixner, linux-kernel, linux-riscv,
Anup Patel
Cc: Paul Walmsley, Samuel Holland, Palmer Dabbelt, Albert Ou,
Daniel Lezcano
Emil Renner Berthing wrote:
> Thomas Gleixner wrote:
> > On Thu, Aug 15 2024 at 03:29, Emil Renner Berthing wrote:
> > > Thomas Gleixner wrote:
> > > Thanks for looking at this! Unfortunately the above patch isn't enough to fix
> > > the issue:
> > >
> > > https://termbin.com/7sgc
> > >
> > > It still hangs after the "[ 0.176451] cpuidle: using governor teo" message
> > > until the watchdog reboots the systems.
> >
> > So what's puzzling is that there is a timer installed early on:
> >
> > [ 0.000000] clocksource: riscv_clocksource: ....
> >
> > That same init function installs the per cpu riscv clockevent, so there
> > should be a timer available.
> >
> > The deffered probing of the PLIC driver delays obviously the probing of
> > the sun4i timer, but that should not matter when another timer is
> > available. So the sun4i driver might be a red herring.
> >
> > Can you please add "ignore_loglevel initcall_debug" to the command line
> > and provide the output of a booting and a failing kernel?
>
> 6.11-rc3 + these reverts: https://termbin.com/q6wk
> 6.11-rc3 + Samuel's patch: https://termbin.com/7cgs
I think this confirms what Charlie found here:
https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/
>
> > And on the booting kernel please provide the output from:
> >
> > # cat /sys/devices/system/clockevents/clockevent0/current_device
> > # cat /sys/devices/system/clockevents/broadcast/current_device
> > # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>
> On both a 6.8.6 kernel and 6.11-rc3 + reverts I get:
>
> # cat /sys/devices/system/clockevents/clockevent0/current_device
> sun4i_tick
> # cat /sys/devices/system/clockevents/broadcast/current_device
> riscv_timer_clockevent
> # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> riscv_clocksource
>
> /Emil
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 12:14 ` Emil Renner Berthing
@ 2024-08-15 13:16 ` Thomas Gleixner
2024-08-15 13:32 ` Samuel Holland
2024-08-15 13:35 ` Emil Renner Berthing
0 siblings, 2 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-08-15 13:16 UTC (permalink / raw)
To: Emil Renner Berthing, Emil Renner Berthing, linux-kernel,
linux-riscv, Anup Patel
Cc: Paul Walmsley, Samuel Holland, Palmer Dabbelt, Albert Ou,
Daniel Lezcano
On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote:
> Emil Renner Berthing wrote:
>> 6.11-rc3 + these reverts: https://termbin.com/q6wk
>> 6.11-rc3 + Samuel's patch: https://termbin.com/7cgs
>
> I think this confirms what Charlie found here:
> https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/
Yes. So the riscv timer is not working on this thing or it stops
somehow.
Can you apply the debug patch below and check whether you see the
'J: ....' output at all and if so whether it stops at some point.
Thanks,
tglx
---
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -2459,6 +2459,9 @@ static void run_local_timers(void)
{
struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]);
+ if (!(jiffies & 0xFF))
+ pr_info("J: %lx\n", jiffies);
+
hrtimer_run_queues();
for (int i = 0; i < NR_BASES; i++, base++) {
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 13:16 ` Thomas Gleixner
@ 2024-08-15 13:32 ` Samuel Holland
2024-08-15 14:11 ` Thomas Gleixner
2024-08-15 14:30 ` Anup Patel
2024-08-15 13:35 ` Emil Renner Berthing
1 sibling, 2 replies; 36+ messages in thread
From: Samuel Holland @ 2024-08-15 13:32 UTC (permalink / raw)
To: Thomas Gleixner, Emil Renner Berthing, linux-kernel, linux-riscv,
Anup Patel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Daniel Lezcano
Hi Thomas, Emil,
On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
> On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote:
>> Emil Renner Berthing wrote:
>>> 6.11-rc3 + these reverts: https://us01.z.antigena.com/l/Er4kZWDmvL5-bLzHHJoZv0k71iwW2jCD5qNpiz0x0XdYY6oORF_nXh7U7jw6oubhi~32HI4i71jUW9v8~NvSvPeUWrdYx3WJBr2GPDUjOu6LYPCOBfR2dVQuMWvlNj4tDjXFp3QEQAmeawZflD4JrIJjtSYIbKfe6v-tgH7SEuHMeSSriU633Lv
>>> 6.11-rc3 + Samuel's patch: https://us01.z.antigena.com/l/EULtAYky6ZvgqZ49KGS-WBsYTg~Ht1NoQtEYmUVb56ymS9jDagqYHLK90WDjnVt69GfB4IX5NSRQXmSfkNsTzB8lJmFvDihHQmGrsCv9FzlorD9yGfXDlQ6rG6vmn5BNDwlipmssGaOGfh9yko8n9ArWR4TLhEf~f9ODqme~NXXwA9DLLc9p
>>
>> I think this confirms what Charlie found here:
>> https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/
>
> Yes. So the riscv timer is not working on this thing or it stops
> somehow.
That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
firmware does not have a timer device, so it does not expose the (optional[1])
SBI time extension, and sbi_set_timer() does nothing.
I wrote a patch (not submitted) to skip registering riscv_clock_event when the
SBI time extension is unavailable, but this doesn't fully solve the issue
either, because then we have no clockevent at all when
check_unaligned_access_all_cpus() is called.
How early in the boot process are we "required" to have a functional clockevent?
Do we need to refactor check_unaligned_access_all_cpus() so it works on systems
where the only clockevent is provided by a platform device?
Regards,
Samuel
[1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/intro.adoc
> Can you apply the debug patch below and check whether you see the
> 'J: ....' output at all and if so whether it stops at some point.
>
> Thanks,
>
> tglx
>
> ---
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -2459,6 +2459,9 @@ static void run_local_timers(void)
> {
> struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]);
>
> + if (!(jiffies & 0xFF))
> + pr_info("J: %lx\n", jiffies);
> +
> hrtimer_run_queues();
>
> for (int i = 0; i < NR_BASES; i++, base++) {
>
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 13:32 ` Samuel Holland
@ 2024-08-15 14:11 ` Thomas Gleixner
2024-08-15 14:16 ` Anup Patel
2024-08-15 14:30 ` Anup Patel
1 sibling, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2024-08-15 14:11 UTC (permalink / raw)
To: Samuel Holland, Emil Renner Berthing, linux-kernel, linux-riscv,
Anup Patel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Daniel Lezcano
On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote:
> On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
>> Yes. So the riscv timer is not working on this thing or it stops
>> somehow.
>
> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
> firmware does not have a timer device, so it does not expose the (optional[1])
> SBI time extension, and sbi_set_timer() does nothing.
Sigh. Does RISCV really have to repeat all mistakes which have been made
by x86, ARM and others before? It's known for decades that the kernel
relies on a working timer...
> I wrote a patch (not submitted) to skip registering riscv_clock_event when the
> SBI time extension is unavailable, but this doesn't fully solve the issue
> either, because then we have no clockevent at all when
> check_unaligned_access_all_cpus() is called.
check_unaligned_access_all_cpus() is irrelevant.
> How early in the boot process are we "required" to have a functional clockevent?
> Do we need to refactor check_unaligned_access_all_cpus() so it works on systems
> where the only clockevent is provided by a platform device?
Right after init/main::late_time_init() everything can depend on a
working timer and on jiffies increasing.
I'm actually surprised that the boot process gets that far. That's just
by pure luck, really.
Thanks,
tglx
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 14:11 ` Thomas Gleixner
@ 2024-08-15 14:16 ` Anup Patel
2024-08-15 14:41 ` Samuel Holland
0 siblings, 1 reply; 36+ messages in thread
From: Anup Patel @ 2024-08-15 14:16 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Samuel Holland, Emil Renner Berthing, linux-kernel, linux-riscv,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Daniel Lezcano
On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote:
> > On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
> >> Yes. So the riscv timer is not working on this thing or it stops
> >> somehow.
> >
> > That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
> > firmware does not have a timer device, so it does not expose the (optional[1])
> > SBI time extension, and sbi_set_timer() does nothing.
>
> Sigh. Does RISCV really have to repeat all mistakes which have been made
> by x86, ARM and others before? It's known for decades that the kernel
> relies on a working timer...
My apologies for the delay in finding a fix for this issue.
Almost all RISC-V platforms (except this one) have SBI Timer always
available and Linux uses a better timer or Sstc extension whenever
it is available.
When Emil first reported this issue, I did try to help him root cause
the issue but unfortunately I don't have this particular platform and
PLIC on all other RISC-V platforms works fine.
I am also surprised that none of the Allwiner folks tried helping.
>
> > I wrote a patch (not submitted) to skip registering riscv_clock_event when the
> > SBI time extension is unavailable, but this doesn't fully solve the issue
> > either, because then we have no clockevent at all when
> > check_unaligned_access_all_cpus() is called.
>
> check_unaligned_access_all_cpus() is irrelevant.
>
> > How early in the boot process are we "required" to have a functional clockevent?
> > Do we need to refactor check_unaligned_access_all_cpus() so it works on systems
> > where the only clockevent is provided by a platform device?
>
> Right after init/main::late_time_init() everything can depend on a
> working timer and on jiffies increasing.
>
> I'm actually surprised that the boot process gets that far. That's just
> by pure luck, really.
>
> Thanks,
>
> tglx
Regards,
Anup
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 14:16 ` Anup Patel
@ 2024-08-15 14:41 ` Samuel Holland
2024-08-15 15:07 ` Emil Renner Berthing
2024-08-15 15:14 ` Thomas Gleixner
0 siblings, 2 replies; 36+ messages in thread
From: Samuel Holland @ 2024-08-15 14:41 UTC (permalink / raw)
To: Anup Patel, Thomas Gleixner, Emil Renner Berthing
Cc: linux-kernel, linux-riscv, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Daniel Lezcano
On 2024-08-15 9:16 AM, Anup Patel wrote:
> On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote:
>>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
>>>> Yes. So the riscv timer is not working on this thing or it stops
>>>> somehow.
>>>
>>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
>>> firmware does not have a timer device, so it does not expose the (optional[1])
>>> SBI time extension, and sbi_set_timer() does nothing.
>>
>> Sigh. Does RISCV really have to repeat all mistakes which have been made
>> by x86, ARM and others before? It's known for decades that the kernel
>> relies on a working timer...
>
> My apologies for the delay in finding a fix for this issue.
>
> Almost all RISC-V platforms (except this one) have SBI Timer always
> available and Linux uses a better timer or Sstc extension whenever
> it is available.
So this is the immediate solution: add the CLINT to the firmware devicetree so
that the SBI time extension works, and Linux will boot without any code changes,
albeit with a higher-overhead clockevent device.
Additionally merging the sun4i timer patch[1] will allow the system to switch to
the better MMIO clocksource later in the boot process.
The reason the CLINT was not added to the devicetree already is that the T-HEAD
version of the CLINT includes an extension to drive SSIP/STIP from a second
S-mode visible set of registers. So it should really have twice as many entries
in its interrupts-extended property as the existing CLINT, and I never got
around to validating that this would work.
The long-term solution would be adding driver support for the T-HEAD CLINT
extensions, which provide an even better clockevent than the sun4i timer.
[1]: https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com/
> When Emil first reported this issue, I did try to help him root cause
> the issue but unfortunately I don't have this particular platform and
> PLIC on all other RISC-V platforms works fine.
>
> I am also surprised that none of the Allwinner folks tried helping.
Allwinner D1 support was upstreamed by unpaid hobbyists with very little
first-party assistance.
>>> I wrote a patch (not submitted) to skip registering riscv_clock_event when the
>>> SBI time extension is unavailable, but this doesn't fully solve the issue
>>> either, because then we have no clockevent at all when
>>> check_unaligned_access_all_cpus() is called.
>>
>> check_unaligned_access_all_cpus() is irrelevant.
>>
>>> How early in the boot process are we "required" to have a functional clockevent?
>>> Do we need to refactor check_unaligned_access_all_cpus() so it works on systems
>>> where the only clockevent is provided by a platform device?
>>
>> Right after init/main::late_time_init() everything can depend on a
>> working timer and on jiffies increasing.
>>
>> I'm actually surprised that the boot process gets that far. That's just
>> by pure luck, really.
Thanks for clearing this up!
Regards,
Samuel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 14:41 ` Samuel Holland
@ 2024-08-15 15:07 ` Emil Renner Berthing
2024-08-15 15:59 ` Samuel Holland
2024-08-15 15:14 ` Thomas Gleixner
1 sibling, 1 reply; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-15 15:07 UTC (permalink / raw)
To: Samuel Holland, Anup Patel, Thomas Gleixner, Emil Renner Berthing
Cc: linux-kernel, linux-riscv, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Daniel Lezcano
Samuel Holland wrote:
> On 2024-08-15 9:16 AM, Anup Patel wrote:
> > On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >>
> >> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote:
> >>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
> >>>> Yes. So the riscv timer is not working on this thing or it stops
> >>>> somehow.
> >>>
> >>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
> >>> firmware does not have a timer device, so it does not expose the (optional[1])
> >>> SBI time extension, and sbi_set_timer() does nothing.
> >>
> >> Sigh. Does RISCV really have to repeat all mistakes which have been made
> >> by x86, ARM and others before? It's known for decades that the kernel
> >> relies on a working timer...
> >
> > My apologies for the delay in finding a fix for this issue.
> >
> > Almost all RISC-V platforms (except this one) have SBI Timer always
> > available and Linux uses a better timer or Sstc extension whenever
> > it is available.
>
> So this is the immediate solution: add the CLINT to the firmware devicetree so
> that the SBI time extension works, and Linux will boot without any code changes,
> albeit with a higher-overhead clockevent device.
But this will mean that you can't update your kernel to v6.9 or newer without
reflashing OpenSBI and u-boot. That's still a regression right?
/Emil
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 15:07 ` Emil Renner Berthing
@ 2024-08-15 15:59 ` Samuel Holland
2024-08-15 17:51 ` Palmer Dabbelt
0 siblings, 1 reply; 36+ messages in thread
From: Samuel Holland @ 2024-08-15 15:59 UTC (permalink / raw)
To: Emil Renner Berthing, Anup Patel, Thomas Gleixner
Cc: linux-kernel, linux-riscv, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Daniel Lezcano
Hi Emil,
On 2024-08-15 10:07 AM, Emil Renner Berthing wrote:
> Samuel Holland wrote:
>> On 2024-08-15 9:16 AM, Anup Patel wrote:
>>> On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>>>
>>>> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote:
>>>>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
>>>>>> Yes. So the riscv timer is not working on this thing or it stops
>>>>>> somehow.
>>>>>
>>>>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
>>>>> firmware does not have a timer device, so it does not expose the (optional[1])
>>>>> SBI time extension, and sbi_set_timer() does nothing.
>>>>
>>>> Sigh. Does RISCV really have to repeat all mistakes which have been made
>>>> by x86, ARM and others before? It's known for decades that the kernel
>>>> relies on a working timer...
>>>
>>> My apologies for the delay in finding a fix for this issue.
>>>
>>> Almost all RISC-V platforms (except this one) have SBI Timer always
>>> available and Linux uses a better timer or Sstc extension whenever
>>> it is available.
>>
>> So this is the immediate solution: add the CLINT to the firmware devicetree so
>> that the SBI time extension works, and Linux will boot without any code changes,
>> albeit with a higher-overhead clockevent device.
>
> But this will mean that you can't update your kernel to v6.9 or newer without
> reflashing OpenSBI and u-boot. That's still a regression right?
I suppose that depends on if you think the SBI time extension is (or should have
been) mandatory for platforms without Sstc. If the SBI time extension is
mandatory, then this is a firmware bug, and not really Linux's responsibility to
work around.
If the SBI time extension is not mandatory, then Linux needs to be able to
handle platforms where the S-mode visible timer is attached to an external
interrupt controller (PLIC or APLIC), so the irqchip driver needs to be loaded
before time_init() (timer_probe()). So in that case, the bug is a Linux
regression, and we would need to revert the platform driver conversion.
Regards,
Samuel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 15:59 ` Samuel Holland
@ 2024-08-15 17:51 ` Palmer Dabbelt
2024-08-15 18:04 ` Thomas Gleixner
2024-08-16 6:13 ` Icenowy Zheng
0 siblings, 2 replies; 36+ messages in thread
From: Palmer Dabbelt @ 2024-08-15 17:51 UTC (permalink / raw)
To: samuel.holland
Cc: Renner Berthing, apatel, tglx, linux-kernel, linux-riscv,
Paul Walmsley, aou, daniel.lezcano
On Thu, 15 Aug 2024 08:59:37 PDT (-0700), samuel.holland@sifive.com wrote:
> Hi Emil,
>
> On 2024-08-15 10:07 AM, Emil Renner Berthing wrote:
>> Samuel Holland wrote:
>>> On 2024-08-15 9:16 AM, Anup Patel wrote:
>>>> On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>>>>
>>>>> On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote:
>>>>>> On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
>>>>>>> Yes. So the riscv timer is not working on this thing or it stops
>>>>>>> somehow.
>>>>>>
>>>>>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
>>>>>> firmware does not have a timer device, so it does not expose the (optional[1])
>>>>>> SBI time extension, and sbi_set_timer() does nothing.
>>>>>
>>>>> Sigh. Does RISCV really have to repeat all mistakes which have been made
>>>>> by x86, ARM and others before? It's known for decades that the kernel
>>>>> relies on a working timer...
It's even worse than that: RISC-V doesn't even mandate any working
_instructions_, much less anything in the platform/firmware.
>>>> My apologies for the delay in finding a fix for this issue.
>>>>
>>>> Almost all RISC-V platforms (except this one) have SBI Timer always
>>>> available and Linux uses a better timer or Sstc extension whenever
>>>> it is available.
>>>
>>> So this is the immediate solution: add the CLINT to the firmware devicetree so
>>> that the SBI time extension works, and Linux will boot without any code changes,
>>> albeit with a higher-overhead clockevent device.
>>
>> But this will mean that you can't update your kernel to v6.9 or newer without
>> reflashing OpenSBI and u-boot. That's still a regression right?
Ya, I'd call that a regression. Updating the firmware on these things
isn't generally something we can rely on users to do, we've worked
around other firmware bugs where we can to avoid forced updates.
> I suppose that depends on if you think the SBI time extension is (or should have
> been) mandatory for platforms without Sstc. If the SBI time extension is
> mandatory, then this is a firmware bug, and not really Linux's responsibility to
> work around.
>
> If the SBI time extension is not mandatory, then Linux needs to be able to
> handle platforms where the S-mode visible timer is attached to an external
> interrupt controller (PLIC or APLIC), so the irqchip driver needs to be loaded
> before time_init() (timer_probe()). So in that case, the bug is a Linux
> regression, and we would need to revert the platform driver conversion.
It doesn't really matter what the specs say (aka intended to say in
RISC-V land): if there's a regression then we have to deal with it.
It's not like whatever's written in the specs actually matters, vendors
can just do whatever they want, so wer'e just stuck making the known
implementations work.
So I think if the revert is the best fix then we should revert it.
That said: If the CLINT works, could we just add a probing quirk to make
it appear on these systems even when it's not in the DT? I'm thinking
something like adding a compatibly string to the CLINT driver for the
SOC (or core or whatever, just something that's already there). We'd
probably need a bit of special-case probing code, but shouldn't be so
bad. We've got some other compatibility-oriented DT quirks floating
around.
> Regards,
> Samuel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 17:51 ` Palmer Dabbelt
@ 2024-08-15 18:04 ` Thomas Gleixner
2024-08-16 6:13 ` Icenowy Zheng
1 sibling, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-08-15 18:04 UTC (permalink / raw)
To: Palmer Dabbelt, samuel.holland
Cc: Renner Berthing, apatel, linux-kernel, linux-riscv, Paul Walmsley,
aou, daniel.lezcano
On Thu, Aug 15 2024 at 10:51, Palmer Dabbelt wrote:
> On Thu, 15 Aug 2024 08:59:37 PDT (-0700), samuel.holland@sifive.com wrote:
>>>>>> Sigh. Does RISCV really have to repeat all mistakes which have been made
>>>>>> by x86, ARM and others before? It's known for decades that the kernel
>>>>>> relies on a working timer...
>
> It's even worse than that: RISC-V doesn't even mandate any working
> _instructions_, much less anything in the platform/firmware.
So it's definitely taking the award for architectural disaster and will
probably keep it for a while.
> So I think if the revert is the best fix then we should revert it.
>
> That said: If the CLINT works, could we just add a probing quirk to make
> it appear on these systems even when it's not in the DT? I'm thinking
> something like adding a compatibly string to the CLINT driver for the
> SOC (or core or whatever, just something that's already there). We'd
> probably need a bit of special-case probing code, but shouldn't be so
> bad. We've got some other compatibility-oriented DT quirks floating
> around.
Alternatively, you can have a quirk in the PLIC driver for that
Allwinner D1 chip which probes it via IRQCHIP_DECLARE() as before with a
special probe function and denies the later platform probe.
Thanks,
tglx
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 17:51 ` Palmer Dabbelt
2024-08-15 18:04 ` Thomas Gleixner
@ 2024-08-16 6:13 ` Icenowy Zheng
1 sibling, 0 replies; 36+ messages in thread
From: Icenowy Zheng @ 2024-08-16 6:13 UTC (permalink / raw)
To: Palmer Dabbelt, samuel.holland
Cc: Renner Berthing, apatel, tglx, linux-kernel, linux-riscv,
Paul Walmsley, aou, daniel.lezcano
在 2024-08-15星期四的 10:51 -0700,Palmer Dabbelt写道:
> On Thu, 15 Aug 2024 08:59:37 PDT (-0700),
> samuel.holland@sifive.com wrote:
> > Hi Emil,
> >
> > On 2024-08-15 10:07 AM, Emil Renner Berthing wrote:
> > > Samuel Holland wrote:
> > > > On 2024-08-15 9:16 AM, Anup Patel wrote:
> > > > > On Thu, Aug 15, 2024 at 7:41 PM Thomas Gleixner
> > > > > <tglx@linutronix.de> wrote:
> > > > > >
> > > > > > On Thu, Aug 15 2024 at 08:32, Samuel Holland wrote:
> > > > > > > On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
> > > > > > > > Yes. So the riscv timer is not working on this thing or
> > > > > > > > it stops
> > > > > > > > somehow.
> > > > > > >
> > > > > > > That's correct. With the (firmware) devicetree that Emil
> > > > > > > is using, the OpenSBI
> > > > > > > firmware does not have a timer device, so it does not
> > > > > > > expose the (optional[1])
> > > > > > > SBI time extension, and sbi_set_timer() does nothing.
> > > > > >
> > > > > > Sigh. Does RISCV really have to repeat all mistakes which
> > > > > > have been made
> > > > > > by x86, ARM and others before? It's known for decades that
> > > > > > the kernel
> > > > > > relies on a working timer...
>
> It's even worse than that: RISC-V doesn't even mandate any working
> _instructions_, much less anything in the platform/firmware.
>
> > > > > My apologies for the delay in finding a fix for this issue.
> > > > >
> > > > > Almost all RISC-V platforms (except this one) have SBI Timer
> > > > > always
> > > > > available and Linux uses a better timer or Sstc extension
> > > > > whenever
> > > > > it is available.
> > > >
> > > > So this is the immediate solution: add the CLINT to the
> > > > firmware devicetree so
> > > > that the SBI time extension works, and Linux will boot without
> > > > any code changes,
> > > > albeit with a higher-overhead clockevent device.
> > >
> > > But this will mean that you can't update your kernel to v6.9 or
> > > newer without
> > > reflashing OpenSBI and u-boot. That's still a regression right?
>
> Ya, I'd call that a regression. Updating the firmware on these
> things
> isn't generally something we can rely on users to do, we've worked
> around other firmware bugs where we can to avoid forced updates.
>
> > I suppose that depends on if you think the SBI time extension is
> > (or should have
> > been) mandatory for platforms without Sstc. If the SBI time
> > extension is
> > mandatory, then this is a firmware bug, and not really Linux's
> > responsibility to
> > work around.
> >
> > If the SBI time extension is not mandatory, then Linux needs to be
> > able to
> > handle platforms where the S-mode visible timer is attached to an
> > external
> > interrupt controller (PLIC or APLIC), so the irqchip driver needs
> > to be loaded
> > before time_init() (timer_probe()). So in that case, the bug is a
> > Linux
> > regression, and we would need to revert the platform driver
> > conversion.
>
> It doesn't really matter what the specs say (aka intended to say in
> RISC-V land): if there's a regression then we have to deal with it.
> It's not like whatever's written in the specs actually matters,
> vendors
> can just do whatever they want, so wer'e just stuck making the known
> implementations work.
>
> So I think if the revert is the best fix then we should revert it.
>
> That said: If the CLINT works, could we just add a probing quirk to
> make
CLINT works, but will will never work in S mode by its design -- the
register used is all M-mode-only.
So it this kind of probing quirk is being added, the target will be
OpenSBI instead of Linux, and the problem of updating firmware still
exists.
> it appear on these systems even when it's not in the DT? I'm
> thinking
> something like adding a compatibly string to the CLINT driver for the
> SOC (or core or whatever, just something that's already there). We'd
> probably need a bit of special-case probing code, but shouldn't be so
> bad. We've got some other compatibility-oriented DT quirks floating
> around.
>
> > Regards,
> > Samuel
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 14:41 ` Samuel Holland
2024-08-15 15:07 ` Emil Renner Berthing
@ 2024-08-15 15:14 ` Thomas Gleixner
1 sibling, 0 replies; 36+ messages in thread
From: Thomas Gleixner @ 2024-08-15 15:14 UTC (permalink / raw)
To: Samuel Holland, Anup Patel, Emil Renner Berthing
Cc: linux-kernel, linux-riscv, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Daniel Lezcano
On Thu, Aug 15 2024 at 09:41, Samuel Holland wrote:
> On 2024-08-15 9:16 AM, Anup Patel wrote:
>> Almost all RISC-V platforms (except this one) have SBI Timer always
>> available and Linux uses a better timer or Sstc extension whenever
>> it is available.
>
> So this is the immediate solution: add the CLINT to the firmware
> devicetree so that the SBI time extension works, and Linux will boot
> without any code changes, albeit with a higher-overhead clockevent
> device.
That does not matter for the boot process when the sun4i timer becomes
available afterwards. But how can this be retrofitted along with the
kernel update?
Thanks,
tglx
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 13:32 ` Samuel Holland
2024-08-15 14:11 ` Thomas Gleixner
@ 2024-08-15 14:30 ` Anup Patel
2024-08-15 15:03 ` Samuel Holland
2024-08-16 6:09 ` Icenowy Zheng
1 sibling, 2 replies; 36+ messages in thread
From: Anup Patel @ 2024-08-15 14:30 UTC (permalink / raw)
To: Samuel Holland
Cc: Thomas Gleixner, Emil Renner Berthing, linux-kernel, linux-riscv,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Daniel Lezcano
On Thu, Aug 15, 2024 at 7:02 PM Samuel Holland
<samuel.holland@sifive.com> wrote:
>
> Hi Thomas, Emil,
>
> On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
> > On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote:
> >> Emil Renner Berthing wrote:
> >>> 6.11-rc3 + these reverts: https://us01.z.antigena.com/l/Er4kZWDmvL5-bLzHHJoZv0k71iwW2jCD5qNpiz0x0XdYY6oORF_nXh7U7jw6oubhi~32HI4i71jUW9v8~NvSvPeUWrdYx3WJBr2GPDUjOu6LYPCOBfR2dVQuMWvlNj4tDjXFp3QEQAmeawZflD4JrIJjtSYIbKfe6v-tgH7SEuHMeSSriU633Lv
> >>> 6.11-rc3 + Samuel's patch: https://us01.z.antigena.com/l/EULtAYky6ZvgqZ49KGS-WBsYTg~Ht1NoQtEYmUVb56ymS9jDagqYHLK90WDjnVt69GfB4IX5NSRQXmSfkNsTzB8lJmFvDihHQmGrsCv9FzlorD9yGfXDlQ6rG6vmn5BNDwlipmssGaOGfh9yko8n9ArWR4TLhEf~f9ODqme~NXXwA9DLLc9p
> >>
> >> I think this confirms what Charlie found here:
> >> https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/
> >
> > Yes. So the riscv timer is not working on this thing or it stops
> > somehow.
>
> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
> firmware does not have a timer device, so it does not expose the (optional[1])
> SBI time extension, and sbi_set_timer() does nothing.
OpenSBI uses platform specific M-mode timer (mtime and mtimecmp) to
provide SBI time extension to Linux.
The RISC-V privileged specification (v1.10 or higher) requires platform to
provide a M-mode timer (mtime and mtimecmp).
This platform not having any M-mode timer is yet another RISC-V spec
violation by this platform.
Regards,
Anup
>
> I wrote a patch (not submitted) to skip registering riscv_clock_event when the
> SBI time extension is unavailable, but this doesn't fully solve the issue
> either, because then we have no clockevent at all when
> check_unaligned_access_all_cpus() is called.
>
> How early in the boot process are we "required" to have a functional clockevent?
> Do we need to refactor check_unaligned_access_all_cpus() so it works on systems
> where the only clockevent is provided by a platform device?
>
> Regards,
> Samuel
>
> [1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/intro.adoc
>
> > Can you apply the debug patch below and check whether you see the
> > 'J: ....' output at all and if so whether it stops at some point.
> >
> > Thanks,
> >
> > tglx
> >
> > ---
> > --- a/kernel/time/timer.c
> > +++ b/kernel/time/timer.c
> > @@ -2459,6 +2459,9 @@ static void run_local_timers(void)
> > {
> > struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]);
> >
> > + if (!(jiffies & 0xFF))
> > + pr_info("J: %lx\n", jiffies);
> > +
> > hrtimer_run_queues();
> >
> > for (int i = 0; i < NR_BASES; i++, base++) {
> >
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
> >
>
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 14:30 ` Anup Patel
@ 2024-08-15 15:03 ` Samuel Holland
2024-08-15 15:53 ` Anup Patel
2024-08-16 6:09 ` Icenowy Zheng
1 sibling, 1 reply; 36+ messages in thread
From: Samuel Holland @ 2024-08-15 15:03 UTC (permalink / raw)
To: Anup Patel
Cc: Thomas Gleixner, Emil Renner Berthing, linux-kernel, linux-riscv,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Daniel Lezcano
Hi Anup,
On 2024-08-15 9:30 AM, Anup Patel wrote:
> On Thu, Aug 15, 2024 at 7:02 PM Samuel Holland
> <samuel.holland@sifive.com> wrote:
>>> Yes. So the riscv timer is not working on this thing or it stops
>>> somehow.
>>
>> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
>> firmware does not have a timer device, so it does not expose the (optional[1])
>> SBI time extension, and sbi_set_timer() does nothing.
>
> OpenSBI uses platform specific M-mode timer (mtime and mtimecmp) to
> provide SBI time extension to Linux.
>
> The RISC-V privileged specification (v1.10 or higher) requires platform to
> provide a M-mode timer (mtime and mtimecmp).
>
> This platform not having any M-mode timer is yet another RISC-V spec
> violation by this platform.
You've misunderstood here. Allwinner D1 (T-HEAD C906) _does_ have an M-mode
timer (a CLINT). It is just omitted from devicetree that Emil happens to be
using, so OpenSBI isn't using it.
Currently OpenSBI allows the system to boot without a timer device, and the SBI
specification does not mandate the time extension. If consensus is that either
of these should change, that's fine, but currently I see nothing in either the
privileged spec nor the SBI spec that guarantees the availability of some timer
to the kernel in S-mode.
Regards,
Samuel
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 15:03 ` Samuel Holland
@ 2024-08-15 15:53 ` Anup Patel
0 siblings, 0 replies; 36+ messages in thread
From: Anup Patel @ 2024-08-15 15:53 UTC (permalink / raw)
To: Samuel Holland
Cc: Thomas Gleixner, Emil Renner Berthing, linux-kernel, linux-riscv,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Daniel Lezcano
On Thu, Aug 15, 2024 at 8:33 PM Samuel Holland
<samuel.holland@sifive.com> wrote:
>
> Hi Anup,
>
> On 2024-08-15 9:30 AM, Anup Patel wrote:
> > On Thu, Aug 15, 2024 at 7:02 PM Samuel Holland
> > <samuel.holland@sifive.com> wrote:
> >>> Yes. So the riscv timer is not working on this thing or it stops
> >>> somehow.
> >>
> >> That's correct. With the (firmware) devicetree that Emil is using, the OpenSBI
> >> firmware does not have a timer device, so it does not expose the (optional[1])
> >> SBI time extension, and sbi_set_timer() does nothing.
> >
> > OpenSBI uses platform specific M-mode timer (mtime and mtimecmp) to
> > provide SBI time extension to Linux.
> >
> > The RISC-V privileged specification (v1.10 or higher) requires platform to
> > provide a M-mode timer (mtime and mtimecmp).
> >
> > This platform not having any M-mode timer is yet another RISC-V spec
> > violation by this platform.
>
> You've misunderstood here. Allwinner D1 (T-HEAD C906) _does_ have an M-mode
> timer (a CLINT). It is just omitted from devicetree that Emil happens to be
> using, so OpenSBI isn't using it.
>
> Currently OpenSBI allows the system to boot without a timer device, and the SBI
> specification does not mandate the time extension. If consensus is that either
> of these should change, that's fine, but currently I see nothing in either the
> privileged spec nor the SBI spec that guarantees the availability of some timer
> to the kernel in S-mode.
>
The SBI time is certainly optional hence OpenSBI does not hang or crash if
it can't provide SBI time to supervisor software.
My comment is from the RISC-V privileged spec perspective:
1) Priv v1.10 says "Platforms provide a real-time counter, exposed as a
memory-mapped machine-mode register, mtime." in section "3.1.15 Machine
Timer Registers (mtime and mtimecmp)".
2) Similar statement in Priv v1.11 section "3.1.10 Machine Timer Registers
(mtime and mtimecmp)"
3) Similar statement in Priv v1.12 section "3.2.1 Machine Timer Registers
(mtime and mtimecmp)"
But since the M-mode timer was omitted from the DT, I think the DT was
always incomplete from the M-mode perspective.
Regards,
Anup
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 14:30 ` Anup Patel
2024-08-15 15:03 ` Samuel Holland
@ 2024-08-16 6:09 ` Icenowy Zheng
1 sibling, 0 replies; 36+ messages in thread
From: Icenowy Zheng @ 2024-08-16 6:09 UTC (permalink / raw)
To: Anup Patel, Samuel Holland
Cc: Thomas Gleixner, Emil Renner Berthing, linux-kernel, linux-riscv,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Daniel Lezcano
在 2024-08-15星期四的 20:00 +0530,Anup Patel写道:
> On Thu, Aug 15, 2024 at 7:02 PM Samuel Holland
> <samuel.holland@sifive.com> wrote:
> >
> > Hi Thomas, Emil,
> >
> > On 2024-08-15 8:16 AM, Thomas Gleixner wrote:
> > > On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote:
> > > > Emil Renner Berthing wrote:
> > > > > 6.11-rc3 + these reverts:
> > > > > https://us01.z.antigena.com/l/Er4kZWDmvL5-bLzHHJoZv0k71iwW2jCD5qNpiz0x0XdYY6oORF_nXh7U7jw6oubhi~32HI4i71jUW9v8~NvSvPeUWrdYx3WJBr2GPDUjOu6LYPCOBfR2dVQuMWvlNj4tDjXFp3QEQAmeawZflD4JrIJjtSYIbKfe6v-tgH7SEuHMeSSriU633Lv
> > > > > 6.11-rc3 + Samuel's patch:
> > > > > https://us01.z.antigena.com/l/EULtAYky6ZvgqZ49KGS-WBsYTg~Ht1NoQtEYmUVb56ymS9jDagqYHLK90WDjnVt69GfB4IX5NSRQXmSfkNsTzB8lJmFvDihHQmGrsCv9FzlorD9yGfXDlQ6rG6vmn5BNDwlipmssGaOGfh9yko8n9ArWR4TLhEf~f9ODqme~NXXwA9DLLc9p
> > > >
> > > > I think this confirms what Charlie found here:
> > > > https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/
> > >
> > > Yes. So the riscv timer is not working on this thing or it stops
> > > somehow.
> >
> > That's correct. With the (firmware) devicetree that Emil is using,
> > the OpenSBI
> > firmware does not have a timer device, so it does not expose the
> > (optional[1])
> > SBI time extension, and sbi_set_timer() does nothing.
>
> OpenSBI uses platform specific M-mode timer (mtime and mtimecmp) to
> provide SBI time extension to Linux.
>
> The RISC-V privileged specification (v1.10 or higher) requires
> platform to
> provide a M-mode timer (mtime and mtimecmp).
The T-Head cores' design is weird, and because of its source code is
available (as OpenC906), we can investigate it further in the RTL
level:
- From software perspective, it has no mtime mmap'ed register, but it
has mtimecmp, which compares with time CSR (a CSR R/O in all priv
levels).
- From SoC integration perspective, the value of the time CSR is
sourced from an external input, pad_cpu_sys_cnt[63:0] [1].
BTW I already added support for this kind of non-standard CLINT to
OpenSBI [2], however I don't think the current firmware DT of D1
utilizes it; and this is also a quite new SBI version I think.
[1]
https://github.com/XUANTIE-RV/openc906/blob/main/C906_RTL_FACTORY/gen_rtl/cpu/rtl/openC906.v#L84
[2]
https://github.com/riscv-software-src/opensbi/commit/b848d8763a737de44b64bfc036c8f51200226440
>
> This platform not having any M-mode timer is yet another RISC-V spec
> violation by this platform.
>
> Regards,
> Anup
>
> >
> > I wrote a patch (not submitted) to skip registering
> > riscv_clock_event when the
> > SBI time extension is unavailable, but this doesn't fully solve the
> > issue
> > either, because then we have no clockevent at all when
> > check_unaligned_access_all_cpus() is called.
> >
> > How early in the boot process are we "required" to have a
> > functional clockevent?
> > Do we need to refactor check_unaligned_access_all_cpus() so it
> > works on systems
> > where the only clockevent is provided by a platform device?
> >
> > Regards,
> > Samuel
> >
> > [1]
> > https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/intro.adoc
> >
> > > Can you apply the debug patch below and check whether you see the
> > > 'J: ....' output at all and if so whether it stops at some point.
> > >
> > > Thanks,
> > >
> > > tglx
> > >
> > > ---
> > > --- a/kernel/time/timer.c
> > > +++ b/kernel/time/timer.c
> > > @@ -2459,6 +2459,9 @@ static void run_local_timers(void)
> > > {
> > > struct timer_base *base =
> > > this_cpu_ptr(&timer_bases[BASE_LOCAL]);
> > >
> > > + if (!(jiffies & 0xFF))
> > > + pr_info("J: %lx\n", jiffies);
> > > +
> > > hrtimer_run_queues();
> > >
> > > for (int i = 0; i < NR_BASES; i++, base++) {
> > >
> > >
> > >
> > > _______________________________________________
> > > linux-riscv mailing list
> > > linux-riscv@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-riscv
> > >
> >
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 13:16 ` Thomas Gleixner
2024-08-15 13:32 ` Samuel Holland
@ 2024-08-15 13:35 ` Emil Renner Berthing
1 sibling, 0 replies; 36+ messages in thread
From: Emil Renner Berthing @ 2024-08-15 13:35 UTC (permalink / raw)
To: Thomas Gleixner, Emil Renner Berthing, linux-kernel, linux-riscv,
Anup Patel
Cc: Paul Walmsley, Samuel Holland, Palmer Dabbelt, Albert Ou,
Daniel Lezcano
Thomas Gleixner wrote:
> On Thu, Aug 15 2024 at 05:14, Emil Renner Berthing wrote:
> > Emil Renner Berthing wrote:
> >> 6.11-rc3 + these reverts: https://termbin.com/q6wk
> >> 6.11-rc3 + Samuel's patch: https://termbin.com/7cgs
> >
> > I think this confirms what Charlie found here:
> > https://lore.kernel.org/linux-riscv/ZoydV7vad5JWIcZb@ghost/
>
> Yes. So the riscv timer is not working on this thing or it stops
> somehow.
>
> Can you apply the debug patch below and check whether you see the
> 'J: ....' output at all and if so whether it stops at some point.
No, I don't see the J: ... debug output anywhere:
https://termbin.com/3vez
/Emil
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-14 17:30 ` [PATCH v1 0/9] Fix Allwinner D1 boot regression Thomas Gleixner
2024-08-15 10:29 ` Emil Renner Berthing
@ 2024-08-15 17:51 ` Palmer Dabbelt
2024-08-15 18:10 ` Thomas Gleixner
2024-08-16 6:15 ` Icenowy Zheng
1 sibling, 2 replies; 36+ messages in thread
From: Palmer Dabbelt @ 2024-08-15 17:51 UTC (permalink / raw)
To: tglx
Cc: Renner Berthing, linux-kernel, linux-riscv, apatel, Paul Walmsley,
samuel.holland, aou, daniel.lezcano
On Wed, 14 Aug 2024 10:30:48 PDT (-0700), tglx@linutronix.de wrote:
> On Wed, Aug 14 2024 at 16:56, Emil Renner Berthing wrote:
>> As described in the thread below[1] I haven't been able to boot my
>> boards based on the Allwinner D1 SoC since 6.9 where you converted the
>> SiFive PLIC driver to a platform driver.
>>
>> This is clearly a regression and there haven't really been much progress
>> on fixing the issue since then, so here is the revert that fixes it.
>>
>> If no other fix is found before 6.11 I suggest we apply this.
>
> So this mess has been ignored for two month now?
>
>>From the pastebin in the initial report:
>
> [ 0.000000] irq: no irq domain found for interrupt-controller@10000000 !
> [ 0.000000] Failed to map interrupt for /soc/timer@2050000
> [ 0.000000] Failed to initialize '/soc/timer@2050000': -22
>
> This comes back with -EINVAL. So the timer cannot find an interrupt,
> which makes it pretty obvious why the system stops to boot, unless there
> is some other timer available.
>
> This is obviously related to the SUN4I_TIMER because that message went
> away when it was disabled according to the next pastebin.
>
> Obviously that can't work because the SUN4I timer driver is using
> timer_of_init() which cannot handle deferred probing.
>
> Daniel: There was a partial fix for the sun4i driver, which you said you
> applied:
>
> https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com
>
> But that thing never materialized in a pull request.
>
> And of course everyone involved ignored the problem since March 13th
> 2024, i.e. almost half a year.
>
> Seriously?
>
> Can you RISCV folks get your act together and ensure to fix things you
> broke on the way? Especially when Emil reported this nobody pointed him
> to this patch and nobody noticed that it's still not merged?
>
> It took me less than 15 minutes to find that patch and the correlation,
> but this is absolutely not my job.
Sorry, I guess I'd just sort of been ignoring the platform-specific side
of things because it's so frustrating to deal with, but that's led to a
bunch of breakages so it's obviously the wrong thing to do.
> I'm seriously grumpy about that. This is not how it works. If you break
> stuff, then you take care to fix it before you shove more changes into
> the tree and waste my time.
>
> I'm very much inclined to take the reverts right now, send them to Linus
> for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv
> patches until the next merge window is over.
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
if you want to take the revert.
IIUC the patch above doesn't actually fix it, that's what led to just
sending the reverts -- at least reverts are better than breaking users.
I'll post over there too...
And it's no big deal if we're in the doghouse for a bit. Regressions
should get fixed faster than this, so we deserve it.
Probably also another sign we're way too focused on getting new features
merged, as that's coming at the expense of making existing platforms
work. IMO we've been way too focused on getting support for specs that
don't even have implementations, and not enough on building real working
systems.
> Emil, can you give that sun4i fix a test ride please?
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 17:51 ` Palmer Dabbelt
@ 2024-08-15 18:10 ` Thomas Gleixner
2024-08-15 23:04 ` Palmer Dabbelt
2024-08-16 6:15 ` Icenowy Zheng
1 sibling, 1 reply; 36+ messages in thread
From: Thomas Gleixner @ 2024-08-15 18:10 UTC (permalink / raw)
To: Palmer Dabbelt
Cc: Renner Berthing, linux-kernel, linux-riscv, apatel, Paul Walmsley,
samuel.holland, aou, daniel.lezcano
On Thu, Aug 15 2024 at 10:51, Palmer Dabbelt wrote:
> On Wed, 14 Aug 2024 10:30:48 PDT (-0700), tglx@linutronix.de wrote:
>> I'm very much inclined to take the reverts right now, send them to Linus
>> for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv
>> patches until the next merge window is over.
>
> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
>
> if you want to take the revert.
I'm happy to wait a week and see whether someone gets that CLINT hack
working or as I suggested the D1 PLIC early probe quirk.
> IIUC the patch above doesn't actually fix it, that's what led to just
> sending the reverts -- at least reverts are better than breaking users.
> I'll post over there too...
Right. We figured that out by now :)
> And it's no big deal if we're in the doghouse for a bit. Regressions
> should get fixed faster than this, so we deserve it.
For a week I consider you probationers :)
> Probably also another sign we're way too focused on getting new features
> merged, as that's coming at the expense of making existing platforms
> work. IMO we've been way too focused on getting support for specs that
> don't even have implementations, and not enough on building real working
> systems.
RISCV is not alone with that. This whole industry is nuts about features
and forgets the stuff what matters.
Thanks,
tglx
^ permalink raw reply [flat|nested] 36+ messages in thread* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 18:10 ` Thomas Gleixner
@ 2024-08-15 23:04 ` Palmer Dabbelt
0 siblings, 0 replies; 36+ messages in thread
From: Palmer Dabbelt @ 2024-08-15 23:04 UTC (permalink / raw)
To: tglx
Cc: Renner Berthing, linux-kernel, linux-riscv, apatel, Paul Walmsley,
samuel.holland, aou, daniel.lezcano
On Thu, 15 Aug 2024 11:10:25 PDT (-0700), tglx@linutronix.de wrote:
> On Thu, Aug 15 2024 at 10:51, Palmer Dabbelt wrote:
>> On Wed, 14 Aug 2024 10:30:48 PDT (-0700), tglx@linutronix.de wrote:
>>> I'm very much inclined to take the reverts right now, send them to Linus
>>> for -rc5 tagged with cc: stable and ignore/nak any irqchip related riscv
>>> patches until the next merge window is over.
>>
>> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
>>
>> if you want to take the revert.
>
> I'm happy to wait a week and see whether someone gets that CLINT hack
> working or as I suggested the D1 PLIC early probe quirk.
>
>> IIUC the patch above doesn't actually fix it, that's what led to just
>> sending the reverts -- at least reverts are better than breaking users.
>> I'll post over there too...
>
> Right. We figured that out by now :)
>
>> And it's no big deal if we're in the doghouse for a bit. Regressions
>> should get fixed faster than this, so we deserve it.
>
> For a week I consider you probationers :)
Works for me ;)
>> Probably also another sign we're way too focused on getting new features
>> merged, as that's coming at the expense of making existing platforms
>> work. IMO we've been way too focused on getting support for specs that
>> don't even have implementations, and not enough on building real working
>> systems.
>
> RISCV is not alone with that. This whole industry is nuts about features
> and forgets the stuff what matters.
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-15 17:51 ` Palmer Dabbelt
2024-08-15 18:10 ` Thomas Gleixner
@ 2024-08-16 6:15 ` Icenowy Zheng
1 sibling, 0 replies; 36+ messages in thread
From: Icenowy Zheng @ 2024-08-16 6:15 UTC (permalink / raw)
To: Palmer Dabbelt, tglx
Cc: Renner Berthing, linux-kernel, linux-riscv, apatel, Paul Walmsley,
samuel.holland, aou, daniel.lezcano
在 2024-08-15星期四的 10:51 -0700,Palmer Dabbelt写道:
> On Wed, 14 Aug 2024 10:30:48 PDT (-0700), tglx@linutronix.de wrote:
> > On Wed, Aug 14 2024 at 16:56, Emil Renner Berthing wrote:
> > > As described in the thread below[1] I haven't been able to boot
> > > my
> > > boards based on the Allwinner D1 SoC since 6.9 where you
> > > converted the
> > > SiFive PLIC driver to a platform driver.
> > >
> > > This is clearly a regression and there haven't really been much
> > > progress
> > > on fixing the issue since then, so here is the revert that fixes
> > > it.
> > >
> > > If no other fix is found before 6.11 I suggest we apply this.
> >
> > So this mess has been ignored for two month now?
> >
> > > From the pastebin in the initial report:
> >
> > [ 0.000000] irq: no irq domain found for
> > interrupt-controller@10000000 !
> > [ 0.000000] Failed to map interrupt for /soc/timer@2050000
> > [ 0.000000] Failed to initialize '/soc/timer@2050000': -22
> >
> > This comes back with -EINVAL. So the timer cannot find an
> > interrupt,
> > which makes it pretty obvious why the system stops to boot, unless
> > there
> > is some other timer available.
> >
> > This is obviously related to the SUN4I_TIMER because that message
> > went
> > away when it was disabled according to the next pastebin.
> >
> > Obviously that can't work because the SUN4I timer driver is using
> > timer_of_init() which cannot handle deferred probing.
> >
> > Daniel: There was a partial fix for the sun4i driver, which you
> > said you
> > applied:
> >
> >
> > https://lore.kernel.org/all/20240312192519.1602493-1-samuel.holland@sifive.com
> >
> > But that thing never materialized in a pull request.
> >
> > And of course everyone involved ignored the problem since March
> > 13th
> > 2024, i.e. almost half a year.
> >
> > Seriously?
> >
> > Can you RISCV folks get your act together and ensure to fix things
> > you
> > broke on the way? Especially when Emil reported this nobody pointed
> > him
> > to this patch and nobody noticed that it's still not merged?
> >
> > It took me less than 15 minutes to find that patch and the
> > correlation,
> > but this is absolutely not my job.
>
> Sorry, I guess I'd just sort of been ignoring the platform-specific
> side
> of things because it's so frustrating to deal with, but that's led to
> a
> bunch of breakages so it's obviously the wrong thing to do.
>
> > I'm seriously grumpy about that. This is not how it works. If you
> > break
> > stuff, then you take care to fix it before you shove more changes
> > into
> > the tree and waste my time.
> >
> > I'm very much inclined to take the reverts right now, send them to
> > Linus
> > for -rc5 tagged with cc: stable and ignore/nak any irqchip related
> > riscv
> > patches until the next merge window is over.
>
> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
>
> if you want to take the revert.
>
> IIUC the patch above doesn't actually fix it, that's what led to just
> sending the reverts -- at least reverts are better than breaking
> users.
> I'll post over there too...
>
> And it's no big deal if we're in the doghouse for a bit. Regressions
> should get fixed faster than this, so we deserve it.
>
> Probably also another sign we're way too focused on getting new
> features
> merged, as that's coming at the expense of making existing platforms
> work. IMO we've been way too focused on getting support for specs
> that
> don't even have implementations, and not enough on building real
> working
> systems.
Well I think all existing platforms are more or less weird (in
specification-compatibility, stability, etc). (Maybe FU540 isn't so
weird, but it has too few peripherals to be really useful, and it's
discontinued; FU740 has some stability issues.)
>
> > Emil, can you give that sun4i fix a test ride please?
> >
> > Thanks,
> >
> > tglx
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 0/9] Fix Allwinner D1 boot regression
2024-08-14 14:56 [PATCH v1 0/9] Fix Allwinner D1 boot regression Emil Renner Berthing
` (9 preceding siblings ...)
2024-08-14 17:30 ` [PATCH v1 0/9] Fix Allwinner D1 boot regression Thomas Gleixner
@ 2024-08-18 14:47 ` Palmer Dabbelt
10 siblings, 0 replies; 36+ messages in thread
From: Palmer Dabbelt @ 2024-08-18 14:47 UTC (permalink / raw)
To: Renner Berthing
Cc: linux-kernel, linux-riscv, apatel, tglx, Paul Walmsley,
samuel.holland, aou
On Wed, 14 Aug 2024 07:56:32 PDT (-0700), Renner Berthing wrote:
> Hi Anup,
>
> As described in the thread below[1] I haven't been able to boot my
> boards based on the Allwinner D1 SoC since 6.9 where you converted the
> SiFive PLIC driver to a platform driver.
>
> This is clearly a regression and there haven't really been much progress
> on fixing the issue since then, so here is the revert that fixes it.
>
> If no other fix is found before 6.11 I suggest we apply this.
>
> [1]: https://lore.kernel.org/linux-riscv/CAJM55Z9hGKo4784N3s3DhWw=nMRKZKcmvZ58x7uVBghExhoc9A@mail.gmail.com/
>
> /Emil
>
> Emil Renner Berthing (9):
> Revert "irqchip/sifive-plic: Chain to parent IRQ after handlers are
> ready"
> Revert "irqchip/sifive-plic: Avoid explicit cpumask allocation on
> stack"
> Revert "irqchip/sifive-plic: Improve locking safety by using
> irqsave/irqrestore"
> Revert "irqchip/sifive-plic: Parse number of interrupts and contexts
> early in plic_probe()"
> Revert "irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain
> creation failure"
> Revert "irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent
> fwnode"
> Revert "irqchip/sifive-plic: Use devm_xyz() for managed allocation"
> Revert "irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz()"
> Revert "irqchip/sifive-plic: Convert PLIC driver into a platform
> driver"
>
> drivers/irqchip/irq-sifive-plic.c | 296 ++++++++++++------------------
> 1 file changed, 117 insertions(+), 179 deletions(-)
I'm still only testing on the QEMU-emulated platforms, but this isn't
regressing over there so
Tested-by: Palmer Dabbelt <palmer@rivosinc.com> # QEMU
Thanks!
^ permalink raw reply [flat|nested] 36+ messages in thread