public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
@ 2026-03-11  2:51 Troy Mitchell
  2026-03-11  6:47 ` Aurelien Jarno
  2026-03-12  3:05 ` Vivian Wang
  0 siblings, 2 replies; 10+ messages in thread
From: Troy Mitchell @ 2026-03-11  2:51 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti
  Cc: linux-riscv, linux-kernel, Troy Mitchell

Currently, the RISC-V implementation of machine_restart() directly calls
do_kernel_restart() without disabling local interrupts or stopping other
CPUs. This missing architectural setup causes fatal issues for systems
that rely on external peripherals (e.g., I2C PMICs) to execute the system
restart when CONFIG_PREEMPT_RCU is enabled.

When a restart handler relies on the I2C subsystem, the I2C core checks
i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
or the polling atomic_xfer. This check evaluates to true if
(!preemptible() || irqs_disabled()).

During do_kernel_restart(), the restart handlers are invoked via
atomic_notifier_call_chain(), which holds an RCU read lock.
The behavior diverges based on the preemption model:
1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
   implicitly disables preemption. preemptible() evaluates to false, and
   the I2C core correctly routes to the atomic, polling transfer path.
2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
   Since machine_restart() left local interrupts enabled, irqs_disabled()
   is false, and preempt_count is 0. Consequently, preemptible() evaluates
   to true.

As a result, the I2C core falsely assumes a sleepable context and routes
the transfer to the standard master_xfer path. This inevitably triggers a
schedule() call while holding the RCU read lock, resulting in a fatal splat:
"Voluntary context switch within RCU read-side critical section!" and
a system hang.

Align RISC-V with other major architectures (e.g., ARM64) by adding
local_irq_disable() and smp_send_stop() to machine_restart().
- local_irq_disable() guarantees a strict atomic context, forcing sub-
  systems like I2C to always fall back to polling mode.
- smp_send_stop() ensures exclusive hardware access by quiescing other
  CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
  during the final restart phase.

Signed-off-by: Troy Mitchell <troy.mitchell@linux.dev>
---
 arch/riscv/kernel/reset.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
index 912288572226..7a5dcfdc3674 100644
--- a/arch/riscv/kernel/reset.c
+++ b/arch/riscv/kernel/reset.c
@@ -5,6 +5,7 @@
 
 #include <linux/reboot.h>
 #include <linux/pm.h>
+#include <linux/smp.h>
 
 static void default_power_off(void)
 {
@@ -17,6 +18,10 @@ EXPORT_SYMBOL(pm_power_off);
 
 void machine_restart(char *cmd)
 {
+	/* Disable interrupts first */
+	local_irq_disable();
+	smp_send_stop();
+
 	do_kernel_restart(cmd);
 	while (1);
 }

---
base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
change-id: 20260311-v7-0-rc1-rv-dis-int-before-restart-5b3e52a4b419

Best regards,
-- 
Troy Mitchell <troy.mitchell@linux.dev>


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-11  2:51 [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart Troy Mitchell
@ 2026-03-11  6:47 ` Aurelien Jarno
  2026-03-11  9:49   ` Troy Mitchell
  2026-03-12  3:05 ` Vivian Wang
  1 sibling, 1 reply; 10+ messages in thread
From: Aurelien Jarno @ 2026-03-11  6:47 UTC (permalink / raw)
  To: Troy Mitchell
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	linux-riscv, linux-kernel

Hi Troy,

On 2026-03-11 10:51, Troy Mitchell wrote:
> Currently, the RISC-V implementation of machine_restart() directly calls
> do_kernel_restart() without disabling local interrupts or stopping other
> CPUs. This missing architectural setup causes fatal issues for systems
> that rely on external peripherals (e.g., I2C PMICs) to execute the system
> restart when CONFIG_PREEMPT_RCU is enabled.
> 
> When a restart handler relies on the I2C subsystem, the I2C core checks
> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
> or the polling atomic_xfer. This check evaluates to true if
> (!preemptible() || irqs_disabled()).
> 
> During do_kernel_restart(), the restart handlers are invoked via
> atomic_notifier_call_chain(), which holds an RCU read lock.
> The behavior diverges based on the preemption model:
> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>    implicitly disables preemption. preemptible() evaluates to false, and
>    the I2C core correctly routes to the atomic, polling transfer path.
> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>    to true.
> 
> As a result, the I2C core falsely assumes a sleepable context and routes
> the transfer to the standard master_xfer path. This inevitably triggers a
> schedule() call while holding the RCU read lock, resulting in a fatal splat:
> "Voluntary context switch within RCU read-side critical section!" and
> a system hang.
> 
> Align RISC-V with other major architectures (e.g., ARM64) by adding
> local_irq_disable() and smp_send_stop() to machine_restart().
> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>   systems like I2C to always fall back to polling mode.
> - smp_send_stop() ensures exclusive hardware access by quiescing other
>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>   during the final restart phase.
> 
> Signed-off-by: Troy Mitchell <troy.mitchell@linux.dev>
> ---
>  arch/riscv/kernel/reset.c | 5 +++++
>  1 file changed, 5 insertions(+)

Thanks. I have been debugging that and it matches my analysis.

> diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
> index 912288572226..7a5dcfdc3674 100644
> --- a/arch/riscv/kernel/reset.c
> +++ b/arch/riscv/kernel/reset.c
> @@ -5,6 +5,7 @@
>  
>  #include <linux/reboot.h>
>  #include <linux/pm.h>
> +#include <linux/smp.h>
>  
>  static void default_power_off(void)
>  {
> @@ -17,6 +18,10 @@ EXPORT_SYMBOL(pm_power_off);
>  
>  void machine_restart(char *cmd)
>  {
> +	/* Disable interrupts first */
> +	local_irq_disable();
> +	smp_send_stop();
> +
>  	do_kernel_restart(cmd);
>  	while (1);
>  }
> 

I have started to change the power reset driver to call the I2C code 
from a workqueue instead of directly from the notifier call back, but 
that's just papering over the issue. Your approach is much better and 
aligns riscv64 with other architectures, which is important as we might 
have shared PMIC drivers.

Therefore:

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

Regards
Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                     http://aurel32.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-11  6:47 ` Aurelien Jarno
@ 2026-03-11  9:49   ` Troy Mitchell
  2026-03-11  9:54     ` Troy Mitchell
  0 siblings, 1 reply; 10+ messages in thread
From: Troy Mitchell @ 2026-03-11  9:49 UTC (permalink / raw)
  To: Aurelien Jarno, Troy Mitchell
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	linux-riscv, linux-kernel

On Wed Mar 11, 2026 at 2:47 PM CST, Aurelien Jarno wrote:
> Hi Troy,
>
> On 2026-03-11 10:51, Troy Mitchell wrote:
>> Currently, the RISC-V implementation of machine_restart() directly calls
>> do_kernel_restart() without disabling local interrupts or stopping other
>> CPUs. This missing architectural setup causes fatal issues for systems
>> that rely on external peripherals (e.g., I2C PMICs) to execute the system
>> restart when CONFIG_PREEMPT_RCU is enabled.
>> 
>> When a restart handler relies on the I2C subsystem, the I2C core checks
>> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
>> or the polling atomic_xfer. This check evaluates to true if
>> (!preemptible() || irqs_disabled()).
>> 
>> During do_kernel_restart(), the restart handlers are invoked via
>> atomic_notifier_call_chain(), which holds an RCU read lock.
>> The behavior diverges based on the preemption model:
>> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>>    implicitly disables preemption. preemptible() evaluates to false, and
>>    the I2C core correctly routes to the atomic, polling transfer path.
>> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>>    to true.
>> 
>> As a result, the I2C core falsely assumes a sleepable context and routes
>> the transfer to the standard master_xfer path. This inevitably triggers a
>> schedule() call while holding the RCU read lock, resulting in a fatal splat:
>> "Voluntary context switch within RCU read-side critical section!" and
>> a system hang.
>> 
>> Align RISC-V with other major architectures (e.g., ARM64) by adding
>> local_irq_disable() and smp_send_stop() to machine_restart().
>> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>>   systems like I2C to always fall back to polling mode.
>> - smp_send_stop() ensures exclusive hardware access by quiescing other
>>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>>   during the final restart phase.
>> 
>> Signed-off-by: Troy Mitchell <troy.mitchell@linux.dev>
>> ---
>>  arch/riscv/kernel/reset.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>
> Thanks. I have been debugging that and it matches my analysis.
>
>> diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
>> index 912288572226..7a5dcfdc3674 100644
>> --- a/arch/riscv/kernel/reset.c
>> +++ b/arch/riscv/kernel/reset.c
>> @@ -5,6 +5,7 @@
>>  
>>  #include <linux/reboot.h>
>>  #include <linux/pm.h>
>> +#include <linux/smp.h>
>>  
>>  static void default_power_off(void)
>>  {
>> @@ -17,6 +18,10 @@ EXPORT_SYMBOL(pm_power_off);
>>  
>>  void machine_restart(char *cmd)
>>  {
>> +	/* Disable interrupts first */
>> +	local_irq_disable();
>> +	smp_send_stop();
>> +
>>  	do_kernel_restart(cmd);
>>  	while (1);
>>  }
>> 
>
> I have started to change the power reset driver to call the I2C code 
> from a workqueue instead of directly from the notifier call back, but 
> that's just papering over the issue.
Since the requirements for i2c_in_atomic() weren't being met, I initially
considered disabling interrupts before the p1 restart code.

However, I didn't feel that was a generic enough solution, so I looked into
the architecture-level implementation. That's when I realized how bare-bones
the current RISC-V machine_restart() actually is..
>
> Your approach is much better and 
> aligns riscv64 with other architectures, which is important as we might 
> have shared PMIC drivers.
>
> Therefore:
>
> Tested-by: Aurelien Jarno <aurelien@aurel32.net>
> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Thanks. :)

                                    - Troy
>
> Regards
> Aurelien


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-11  9:49   ` Troy Mitchell
@ 2026-03-11  9:54     ` Troy Mitchell
  0 siblings, 0 replies; 10+ messages in thread
From: Troy Mitchell @ 2026-03-11  9:54 UTC (permalink / raw)
  To: Troy Mitchell, Aurelien Jarno
  Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	linux-riscv, linux-kernel, troy.mitchell

On Wed Mar 11, 2026 at 5:49 PM CST, Troy Mitchell wrote:
> On Wed Mar 11, 2026 at 2:47 PM CST, Aurelien Jarno wrote:
>> Hi Troy,
>>
>> On 2026-03-11 10:51, Troy Mitchell wrote:
>>> Currently, the RISC-V implementation of machine_restart() directly calls
>>> do_kernel_restart() without disabling local interrupts or stopping other
>>> CPUs. This missing architectural setup causes fatal issues for systems
>>> that rely on external peripherals (e.g., I2C PMICs) to execute the system
>>> restart when CONFIG_PREEMPT_RCU is enabled.
>>> 
>>> When a restart handler relies on the I2C subsystem, the I2C core checks
>>> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
>>> or the polling atomic_xfer. This check evaluates to true if
>>> (!preemptible() || irqs_disabled()).
>>> 
>>> During do_kernel_restart(), the restart handlers are invoked via
>>> atomic_notifier_call_chain(), which holds an RCU read lock.
>>> The behavior diverges based on the preemption model:
>>> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>>>    implicitly disables preemption. preemptible() evaluates to false, and
>>>    the I2C core correctly routes to the atomic, polling transfer path.
>>> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>>>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>>>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>>>    to true.
>>> 
>>> As a result, the I2C core falsely assumes a sleepable context and routes
>>> the transfer to the standard master_xfer path. This inevitably triggers a
>>> schedule() call while holding the RCU read lock, resulting in a fatal splat:
>>> "Voluntary context switch within RCU read-side critical section!" and
>>> a system hang.
>>> 
>>> Align RISC-V with other major architectures (e.g., ARM64) by adding
>>> local_irq_disable() and smp_send_stop() to machine_restart().
>>> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>>>   systems like I2C to always fall back to polling mode.
>>> - smp_send_stop() ensures exclusive hardware access by quiescing other
>>>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>>>   during the final restart phase.
>>> 
>>> Signed-off-by: Troy Mitchell <troy.mitchell@linux.dev>
>>> ---
>>>  arch/riscv/kernel/reset.c | 5 +++++
>>>  1 file changed, 5 insertions(+)
>>
>> Thanks. I have been debugging that and it matches my analysis.
>>
>>> diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
>>> index 912288572226..7a5dcfdc3674 100644
>>> --- a/arch/riscv/kernel/reset.c
>>> +++ b/arch/riscv/kernel/reset.c
>>> @@ -5,6 +5,7 @@
>>>  
>>>  #include <linux/reboot.h>
>>>  #include <linux/pm.h>
>>> +#include <linux/smp.h>
>>>  
>>>  static void default_power_off(void)
>>>  {
>>> @@ -17,6 +18,10 @@ EXPORT_SYMBOL(pm_power_off);
>>>  
>>>  void machine_restart(char *cmd)
>>>  {
>>> +	/* Disable interrupts first */
>>> +	local_irq_disable();
>>> +	smp_send_stop();
>>> +
>>>  	do_kernel_restart(cmd);
>>>  	while (1);
>>>  }
>>> 
>>
>> I have started to change the power reset driver to call the I2C code 
>> from a workqueue instead of directly from the notifier call back, but 
>> that's just papering over the issue.
> Since the requirements for i2c_in_atomic() weren't being met, I initially
> considered disabling interrupts before the p1 restart code.
>
> However, I didn't feel that was a generic enough solution, so I looked into
> the architecture-level implementation. That's when I realized how bare-bones
> the current RISC-V machine_restart() actually is..
>
FYI, the discussion with Aurelien started here [1].

Link: https://lore.kernel.org/all/DGZM6WUAVPPS.20Y0NIZYI4572@linux.spacemit.com/ [1]


                          - Troy
>>
>> Your approach is much better and 
>> aligns riscv64 with other architectures, which is important as we might 
>> have shared PMIC drivers.
>>
>> Therefore:
>>
>> Tested-by: Aurelien Jarno <aurelien@aurel32.net>
>> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
> Thanks. :)
>
>                                     - Troy
>>
>> Regards
>> Aurelien


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-11  2:51 [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart Troy Mitchell
  2026-03-11  6:47 ` Aurelien Jarno
@ 2026-03-12  3:05 ` Vivian Wang
  2026-03-16  7:23   ` Troy Mitchell
  1 sibling, 1 reply; 10+ messages in thread
From: Vivian Wang @ 2026-03-12  3:05 UTC (permalink / raw)
  To: Troy Mitchell, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Alexandre Ghiti
  Cc: linux-riscv, linux-kernel

On 3/11/26 10:51, Troy Mitchell wrote:
> Currently, the RISC-V implementation of machine_restart() directly calls
> do_kernel_restart() without disabling local interrupts or stopping other
> CPUs. This missing architectural setup causes fatal issues for systems
> that rely on external peripherals (e.g., I2C PMICs) to execute the system
> restart when CONFIG_PREEMPT_RCU is enabled.
>
> When a restart handler relies on the I2C subsystem, the I2C core checks
> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
> or the polling atomic_xfer. This check evaluates to true if
> (!preemptible() || irqs_disabled()).
>
> During do_kernel_restart(), the restart handlers are invoked via
> atomic_notifier_call_chain(), which holds an RCU read lock.
> The behavior diverges based on the preemption model:
> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>    implicitly disables preemption. preemptible() evaluates to false, and
>    the I2C core correctly routes to the atomic, polling transfer path.
> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>    to true.
>
> As a result, the I2C core falsely assumes a sleepable context and routes
> the transfer to the standard master_xfer path. This inevitably triggers a
> schedule() call while holding the RCU read lock, resulting in a fatal splat:
> "Voluntary context switch within RCU read-side critical section!" and
> a system hang.
>
> Align RISC-V with other major architectures (e.g., ARM64) by adding
> local_irq_disable() and smp_send_stop() to machine_restart().
> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>   systems like I2C to always fall back to polling mode.
> - smp_send_stop() ensures exclusive hardware access by quiescing other
>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>   during the final restart phase.

Maybe while we're at it, we can fix the other functions in this file as
well?

I think the reason we ended up with the "unsafe" implementations of the
reboot/shutdown functions is that on the backend it is usually SBI SRST
calls, which can protect itself from other CPUs and interrupts. Since on
K1 we're going to be poking I2C directly, we run into the problem
described above. So all of these should disable interrupts and stop
other CPUs before calling the handlers, and can't assume the handlers
are all SBI SRST.

> Signed-off-by: Troy Mitchell <troy.mitchell@linux.dev>
> ---
>  arch/riscv/kernel/reset.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
> index 912288572226..7a5dcfdc3674 100644
> --- a/arch/riscv/kernel/reset.c
> +++ b/arch/riscv/kernel/reset.c
> @@ -5,6 +5,7 @@
>  
>  #include <linux/reboot.h>
>  #include <linux/pm.h>
> +#include <linux/smp.h>
>  
>  static void default_power_off(void)
>  {
> @@ -17,6 +18,10 @@ EXPORT_SYMBOL(pm_power_off);
>  
>  void machine_restart(char *cmd)
>  {
> +	/* Disable interrupts first */
> +	local_irq_disable();
> +	smp_send_stop();
> +
>  	do_kernel_restart(cmd);
>  	while (1);
>  }

So... one thing I'm not certain is that arm64 also has some EFI handling
here. But I think the safe choice is to ignore EFI for now until it's
needed, instead of preemptively doing it. Who knows what shenanigans
firmwares can get up to.

Thanks for the patch!

Vivian "dramforever" Wang


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-12  3:05 ` Vivian Wang
@ 2026-03-16  7:23   ` Troy Mitchell
  2026-03-16 13:19     ` Samuel Holland
  0 siblings, 1 reply; 10+ messages in thread
From: Troy Mitchell @ 2026-03-16  7:23 UTC (permalink / raw)
  To: Vivian Wang, Troy Mitchell, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Alexandre Ghiti
  Cc: linux-riscv, linux-kernel

Hi Vivian,

Thanks for the review and the insights!

On Thu Mar 12, 2026 at 11:05 AM CST, Vivian Wang wrote:
> On 3/11/26 10:51, Troy Mitchell wrote:
>> Currently, the RISC-V implementation of machine_restart() directly calls
>> do_kernel_restart() without disabling local interrupts or stopping other
>> CPUs. This missing architectural setup causes fatal issues for systems
>> that rely on external peripherals (e.g., I2C PMICs) to execute the system
>> restart when CONFIG_PREEMPT_RCU is enabled.
>>
>> When a restart handler relies on the I2C subsystem, the I2C core checks
>> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
>> or the polling atomic_xfer. This check evaluates to true if
>> (!preemptible() || irqs_disabled()).
>>
>> During do_kernel_restart(), the restart handlers are invoked via
>> atomic_notifier_call_chain(), which holds an RCU read lock.
>> The behavior diverges based on the preemption model:
>> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>>    implicitly disables preemption. preemptible() evaluates to false, and
>>    the I2C core correctly routes to the atomic, polling transfer path.
>> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>>    to true.
>>
>> As a result, the I2C core falsely assumes a sleepable context and routes
>> the transfer to the standard master_xfer path. This inevitably triggers a
>> schedule() call while holding the RCU read lock, resulting in a fatal splat:
>> "Voluntary context switch within RCU read-side critical section!" and
>> a system hang.
>>
>> Align RISC-V with other major architectures (e.g., ARM64) by adding
>> local_irq_disable() and smp_send_stop() to machine_restart().
>> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>>   systems like I2C to always fall back to polling mode.
>> - smp_send_stop() ensures exclusive hardware access by quiescing other
>>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>>   during the final restart phase.
>
> Maybe while we're at it, we can fix the other functions in this file as
> well?
Nice catch. I'll fix other functions in the next version.
>
> I think the reason we ended up with the "unsafe" implementations of the
> reboot/shutdown functions is that on the backend it is usually SBI SRST
> calls, which can protect itself from other CPUs and interrupts. Since on
> K1 we're going to be poking I2C directly, we run into the problem
> described above. So all of these should disable interrupts and stop
> other CPUs before calling the handlers, and can't assume the handlers
> are all SBI SRST.
Yes, we cannot assume that all platforms rely on this.
>
>> Signed-off-by: Troy Mitchell <troy.mitchell@linux.dev>
>> ---
>>  arch/riscv/kernel/reset.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
>> index 912288572226..7a5dcfdc3674 100644
>> --- a/arch/riscv/kernel/reset.c
>> +++ b/arch/riscv/kernel/reset.c
>> @@ -5,6 +5,7 @@
>>  
>>  #include <linux/reboot.h>
>>  #include <linux/pm.h>
>> +#include <linux/smp.h>
>>  
>>  static void default_power_off(void)
>>  {
>> @@ -17,6 +18,10 @@ EXPORT_SYMBOL(pm_power_off);
>>  
>>  void machine_restart(char *cmd)
>>  {
>> +	/* Disable interrupts first */
>> +	local_irq_disable();
>> +	smp_send_stop();
>> +
>>  	do_kernel_restart(cmd);
>>  	while (1);
>>  }
>
> So... one thing I'm not certain is that arm64 also has some EFI handling
> here. But I think the safe choice is to ignore EFI for now until it's
> needed, instead of preemptively doing it. Who knows what shenanigans
> firmwares can get up to.
I agree at all.

>
> Thanks for the patch!
I'll prepare v2 with those functions fixed. Looking forward to your further review
on the updated version

                                                - Troy
>
> Vivian "dramforever" Wang


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-16  7:23   ` Troy Mitchell
@ 2026-03-16 13:19     ` Samuel Holland
  2026-03-17  2:45       ` Troy Mitchell
  0 siblings, 1 reply; 10+ messages in thread
From: Samuel Holland @ 2026-03-16 13:19 UTC (permalink / raw)
  To: Troy Mitchell, Vivian Wang, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Alexandre Ghiti
  Cc: linux-riscv, linux-kernel

Hi Troy,

On 2026-03-16 2:23 AM, Troy Mitchell wrote:
> On Thu Mar 12, 2026 at 11:05 AM CST, Vivian Wang wrote:
>> On 3/11/26 10:51, Troy Mitchell wrote:
>>> Currently, the RISC-V implementation of machine_restart() directly calls
>>> do_kernel_restart() without disabling local interrupts or stopping other
>>> CPUs. This missing architectural setup causes fatal issues for systems
>>> that rely on external peripherals (e.g., I2C PMICs) to execute the system
>>> restart when CONFIG_PREEMPT_RCU is enabled.
>>>
>>> When a restart handler relies on the I2C subsystem, the I2C core checks
>>> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
>>> or the polling atomic_xfer. This check evaluates to true if
>>> (!preemptible() || irqs_disabled()).
>>>
>>> During do_kernel_restart(), the restart handlers are invoked via
>>> atomic_notifier_call_chain(), which holds an RCU read lock.
>>> The behavior diverges based on the preemption model:
>>> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>>>    implicitly disables preemption. preemptible() evaluates to false, and
>>>    the I2C core correctly routes to the atomic, polling transfer path.
>>> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>>>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>>>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>>>    to true.
>>>
>>> As a result, the I2C core falsely assumes a sleepable context and routes
>>> the transfer to the standard master_xfer path. This inevitably triggers a
>>> schedule() call while holding the RCU read lock, resulting in a fatal splat:
>>> "Voluntary context switch within RCU read-side critical section!" and
>>> a system hang.
>>>
>>> Align RISC-V with other major architectures (e.g., ARM64) by adding
>>> local_irq_disable() and smp_send_stop() to machine_restart().
>>> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>>>   systems like I2C to always fall back to polling mode.
>>> - smp_send_stop() ensures exclusive hardware access by quiescing other
>>>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>>>   during the final restart phase.
>>
>> Maybe while we're at it, we can fix the other functions in this file as
>> well?
> Nice catch. I'll fix other functions in the next version.
>>
>> I think the reason we ended up with the "unsafe" implementations of the
>> reboot/shutdown functions is that on the backend it is usually SBI SRST
>> calls, which can protect itself from other CPUs and interrupts. Since on
>> K1 we're going to be poking I2C directly, we run into the problem
>> described above. So all of these should disable interrupts and stop
>> other CPUs before calling the handlers, and can't assume the handlers
>> are all SBI SRST.
> Yes, we cannot assume that all platforms rely on this.

Why isn't K1 using the SBI SRST extension? Resetting the platform from S-mode
directly causes problems if you ever want to run another supervisor domain (for
example a TEE or EFI runtime services), which may need to clean up before a
system reset.

As you mention, other platforms use the standard SBI SRST interface, event if
they need to poke a PMIC to perform a system reset. Is there something
preventing K1 from following this path?

Regards,
Samuel


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-16 13:19     ` Samuel Holland
@ 2026-03-17  2:45       ` Troy Mitchell
  2026-03-17  4:07         ` Samuel Holland
  0 siblings, 1 reply; 10+ messages in thread
From: Troy Mitchell @ 2026-03-17  2:45 UTC (permalink / raw)
  To: Samuel Holland, Troy Mitchell, Vivian Wang, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Alexandre Ghiti
  Cc: linux-riscv, linux-kernel, spacemit

On Mon Mar 16, 2026 at 9:19 PM CST, Samuel Holland wrote:
Hi Samuel,
> Hi Troy,
>
> On 2026-03-16 2:23 AM, Troy Mitchell wrote:
>> On Thu Mar 12, 2026 at 11:05 AM CST, Vivian Wang wrote:
>>> On 3/11/26 10:51, Troy Mitchell wrote:
>>>> Currently, the RISC-V implementation of machine_restart() directly calls
>>>> do_kernel_restart() without disabling local interrupts or stopping other
>>>> CPUs. This missing architectural setup causes fatal issues for systems
>>>> that rely on external peripherals (e.g., I2C PMICs) to execute the system
>>>> restart when CONFIG_PREEMPT_RCU is enabled.
>>>>
>>>> When a restart handler relies on the I2C subsystem, the I2C core checks
>>>> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
>>>> or the polling atomic_xfer. This check evaluates to true if
>>>> (!preemptible() || irqs_disabled()).
>>>>
>>>> During do_kernel_restart(), the restart handlers are invoked via
>>>> atomic_notifier_call_chain(), which holds an RCU read lock.
>>>> The behavior diverges based on the preemption model:
>>>> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>>>>    implicitly disables preemption. preemptible() evaluates to false, and
>>>>    the I2C core correctly routes to the atomic, polling transfer path.
>>>> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>>>>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>>>>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>>>>    to true.
>>>>
>>>> As a result, the I2C core falsely assumes a sleepable context and routes
>>>> the transfer to the standard master_xfer path. This inevitably triggers a
>>>> schedule() call while holding the RCU read lock, resulting in a fatal splat:
>>>> "Voluntary context switch within RCU read-side critical section!" and
>>>> a system hang.
>>>>
>>>> Align RISC-V with other major architectures (e.g., ARM64) by adding
>>>> local_irq_disable() and smp_send_stop() to machine_restart().
>>>> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>>>>   systems like I2C to always fall back to polling mode.
>>>> - smp_send_stop() ensures exclusive hardware access by quiescing other
>>>>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>>>>   during the final restart phase.
>>>
>>> Maybe while we're at it, we can fix the other functions in this file as
>>> well?
>> Nice catch. I'll fix other functions in the next version.
>>>
>>> I think the reason we ended up with the "unsafe" implementations of the
>>> reboot/shutdown functions is that on the backend it is usually SBI SRST
>>> calls, which can protect itself from other CPUs and interrupts. Since on
>>> K1 we're going to be poking I2C directly, we run into the problem
>>> described above. So all of these should disable interrupts and stop
>>> other CPUs before calling the handlers, and can't assume the handlers
>>> are all SBI SRST.
>> Yes, we cannot assume that all platforms rely on this.
>
> Why isn't K1 using the SBI SRST extension? Resetting the platform from S-mode
> directly causes problems if you ever want to run another supervisor domain (for
> example a TEE or EFI runtime services), which may need to clean up before a
> system reset.
>
> As you mention, other platforms use the standard SBI SRST interface, event if
> they need to poke a PMIC to perform a system reset. Is there something
> preventing K1 from following this path?

Ideally, yes, resetting the platform should be handled by the firmware (SBI SRST) to
properly tear down M-mode/TEE states before pulling the plug.

However, for the K1 platform, while the SoC itself can be reset via SBI SRST, this warm
reset does not propagate to the external PMIC. For instance, if the kernel switches the
SD card to 1.8V signaling for high-speed mode during runtime, an SoC-only reset leaves
the PMIC still supplying 1.8V. Upon reboot, the bootrom/early kernel expects the SD card
to be in the default 3.3V state for initialization, which inevitably leads to a boot hang.

Setting the K1-specific hardware design aside, I believe this patch is fundamentally necessary
to fix a latent bug in the RISC-V Linux kernel itself, regardless of whether the final
reset is backed by SBI or not.

The core issue here is the context in which do_kernel_restart() invokes the restart_handler_list.
It does so via an atomic_notifier_call_chain. If RISC-V enters this phase without disabling local IRQs,
we are leaving a trap for any generic driver or kernel subsystem invoked during this teardown phase.

Under CONFIG_PREEMPT_RCU, the RCU read lock does not bump the preempt count.
Without local_irq_disable(), the preemptible() check will unexpectedly evaluate to true.

A quick grep shows that preemptible() is widely relied upon by generic subsystems to
determine safe execution paths (~16 occurrences in kernel/ and ~11 in drivers/).
This means generic code implicitly trusts the architecture to set up the correct
atomic context during a system restart.

The I2C PMIC crash we encountered is just one symptom of this missing context. If we don't
fix this at the architecture level, it leaves the door open for other undiscovered panics.
Any driver (watchdog, display, network, etc.) that registers a restart handler and internally
relies on preemptible() to choose between a sleepable or polling path will inevitably trigger
a schedule(), attempt to acquire a sleeping lock (e.g., a mutex), or call other blocking functions
while holding the RCU read lock, resulting in a fatal splat.

*Aligning machine_restart() with other architectures* by adding local_irq_disable() and smp_send_stop()
ensures a deterministic, single-threaded atomic context.
This protects the Linux teardown sequence from SMP deadlocks and context misidentification,
well before the control is ever handed over to the firmware or hardware.

Does this perspective make sense?

                      - Troy
>
> Regards,
> Samuel


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-17  2:45       ` Troy Mitchell
@ 2026-03-17  4:07         ` Samuel Holland
  2026-03-17  7:42           ` Troy Mitchell
  0 siblings, 1 reply; 10+ messages in thread
From: Samuel Holland @ 2026-03-17  4:07 UTC (permalink / raw)
  To: Troy Mitchell, Vivian Wang, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Alexandre Ghiti
  Cc: linux-riscv, linux-kernel, spacemit

Hi Troy,

On 2026-03-16 9:45 PM, Troy Mitchell wrote:
> On Mon Mar 16, 2026 at 9:19 PM CST, Samuel Holland wrote:
>> On 2026-03-16 2:23 AM, Troy Mitchell wrote:
>>> On Thu Mar 12, 2026 at 11:05 AM CST, Vivian Wang wrote:
>>>> On 3/11/26 10:51, Troy Mitchell wrote:
>>>>> Currently, the RISC-V implementation of machine_restart() directly calls
>>>>> do_kernel_restart() without disabling local interrupts or stopping other
>>>>> CPUs. This missing architectural setup causes fatal issues for systems
>>>>> that rely on external peripherals (e.g., I2C PMICs) to execute the system
>>>>> restart when CONFIG_PREEMPT_RCU is enabled.
>>>>>
>>>>> When a restart handler relies on the I2C subsystem, the I2C core checks
>>>>> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
>>>>> or the polling atomic_xfer. This check evaluates to true if
>>>>> (!preemptible() || irqs_disabled()).
>>>>>
>>>>> During do_kernel_restart(), the restart handlers are invoked via
>>>>> atomic_notifier_call_chain(), which holds an RCU read lock.
>>>>> The behavior diverges based on the preemption model:
>>>>> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>>>>>    implicitly disables preemption. preemptible() evaluates to false, and
>>>>>    the I2C core correctly routes to the atomic, polling transfer path.
>>>>> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>>>>>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>>>>>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>>>>>    to true.
>>>>>
>>>>> As a result, the I2C core falsely assumes a sleepable context and routes
>>>>> the transfer to the standard master_xfer path. This inevitably triggers a
>>>>> schedule() call while holding the RCU read lock, resulting in a fatal splat:
>>>>> "Voluntary context switch within RCU read-side critical section!" and
>>>>> a system hang.
>>>>>
>>>>> Align RISC-V with other major architectures (e.g., ARM64) by adding
>>>>> local_irq_disable() and smp_send_stop() to machine_restart().
>>>>> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>>>>>   systems like I2C to always fall back to polling mode.
>>>>> - smp_send_stop() ensures exclusive hardware access by quiescing other
>>>>>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>>>>>   during the final restart phase.
>>>>
>>>> Maybe while we're at it, we can fix the other functions in this file as
>>>> well?
>>> Nice catch. I'll fix other functions in the next version.
>>>>
>>>> I think the reason we ended up with the "unsafe" implementations of the
>>>> reboot/shutdown functions is that on the backend it is usually SBI SRST
>>>> calls, which can protect itself from other CPUs and interrupts. Since on
>>>> K1 we're going to be poking I2C directly, we run into the problem
>>>> described above. So all of these should disable interrupts and stop
>>>> other CPUs before calling the handlers, and can't assume the handlers
>>>> are all SBI SRST.
>>> Yes, we cannot assume that all platforms rely on this.
>>
>> Why isn't K1 using the SBI SRST extension? Resetting the platform from S-mode
>> directly causes problems if you ever want to run another supervisor domain (for
>> example a TEE or EFI runtime services), which may need to clean up before a
>> system reset.
>>
>> As you mention, other platforms use the standard SBI SRST interface, event if
>> they need to poke a PMIC to perform a system reset. Is there something
>> preventing K1 from following this path?
> 
> Ideally, yes, resetting the platform should be handled by the firmware (SBI SRST) to
> properly tear down M-mode/TEE states before pulling the plug.
> 
> However, for the K1 platform, while the SoC itself can be reset via SBI SRST, this warm
> reset does not propagate to the external PMIC. For instance, if the kernel switches the
> SD card to 1.8V signaling for high-speed mode during runtime, an SoC-only reset leaves
> the PMIC still supplying 1.8V. Upon reboot, the bootrom/early kernel expects the SD card
> to be in the default 3.3V state for initialization, which inevitably leads to a boot hang.

Right, so you would need a driver in M-mode that performs a PMIC-level reset
instead of a SoC-level reset when the SBI SRST function is called. Several
platforms already do this.

> Setting the K1-specific hardware design aside, I believe this patch is fundamentally necessary
> to fix a latent bug in the RISC-V Linux kernel itself, regardless of whether the final
> reset is backed by SBI or not.
> 
> The core issue here is the context in which do_kernel_restart() invokes the restart_handler_list.
> It does so via an atomic_notifier_call_chain. If RISC-V enters this phase without disabling local IRQs,
> we are leaving a trap for any generic driver or kernel subsystem invoked during this teardown phase.
> 
> Under CONFIG_PREEMPT_RCU, the RCU read lock does not bump the preempt count.
> Without local_irq_disable(), the preemptible() check will unexpectedly evaluate to true.
> 
> A quick grep shows that preemptible() is widely relied upon by generic subsystems to
> determine safe execution paths (~16 occurrences in kernel/ and ~11 in drivers/).
> This means generic code implicitly trusts the architecture to set up the correct
> atomic context during a system restart.

I agree with the conceptual issue, that do_kernel_restart() should be called
from an atomic context, but may not be. But there seems to be no practical
problem if the first/only entry in restart_handler_list is sbi_srst_reboot(),
which is why we haven't seen problems on other platforms.

> The I2C PMIC crash we encountered is just one symptom of this missing context. If we don't
> fix this at the architecture level, it leaves the door open for other undiscovered panics.
> Any driver (watchdog, display, network, etc.) that registers a restart handler and internally
> relies on preemptible() to choose between a sleepable or polling path will inevitably trigger
> a schedule(), attempt to acquire a sleeping lock (e.g., a mutex), or call other blocking functions
> while holding the RCU read lock, resulting in a fatal splat.

Are you possibly confusing restart_handler_list and reboot_notifier_list here? A
display or network driver would not register a restart handler. And similarly to
the PMIC case, I would expect a platform that resets by watchdog to have a
M-mode driver for it and still use the SBI SRST extension.

> *Aligning machine_restart() with other architectures* by adding local_irq_disable() and smp_send_stop()
> ensures a deterministic, single-threaded atomic context.
> This protects the Linux teardown sequence from SMP deadlocks and context misidentification,
> well before the control is ever handed over to the firmware or hardware.
> 
> Does this perspective make sense?

Yes, I'm not opposed to making this change, because it is conceptually the right
thing to do. I'm only questioning why you are in a position to notice this issue
in the first place: you only see this because Linux is driving the PMIC directly
instead of going through firmware.

Regards,
Samuel


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart
  2026-03-17  4:07         ` Samuel Holland
@ 2026-03-17  7:42           ` Troy Mitchell
  0 siblings, 0 replies; 10+ messages in thread
From: Troy Mitchell @ 2026-03-17  7:42 UTC (permalink / raw)
  To: Samuel Holland, Troy Mitchell, Vivian Wang, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Alexandre Ghiti
  Cc: linux-riscv, linux-kernel, spacemit

Hi Samuel,

On Tue Mar 17, 2026 at 12:07 PM CST, Samuel Holland wrote:
> On 2026-03-16 9:45 PM, Troy Mitchell wrote:
>> On Mon Mar 16, 2026 at 9:19 PM CST, Samuel Holland wrote:
>>> On 2026-03-16 2:23 AM, Troy Mitchell wrote:
>>>> On Thu Mar 12, 2026 at 11:05 AM CST, Vivian Wang wrote:
>>>>> On 3/11/26 10:51, Troy Mitchell wrote:
>>>>>> Currently, the RISC-V implementation of machine_restart() directly calls
>>>>>> do_kernel_restart() without disabling local interrupts or stopping other
>>>>>> CPUs. This missing architectural setup causes fatal issues for systems
>>>>>> that rely on external peripherals (e.g., I2C PMICs) to execute the system
>>>>>> restart when CONFIG_PREEMPT_RCU is enabled.
>>>>>>
>>>>>> When a restart handler relies on the I2C subsystem, the I2C core checks
>>>>>> i2c_in_atomic_xfer_mode() to decide whether to use the sleepable xfer
>>>>>> or the polling atomic_xfer. This check evaluates to true if
>>>>>> (!preemptible() || irqs_disabled()).
>>>>>>
>>>>>> During do_kernel_restart(), the restart handlers are invoked via
>>>>>> atomic_notifier_call_chain(), which holds an RCU read lock.
>>>>>> The behavior diverges based on the preemption model:
>>>>>> 1. Under CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_NONE, rcu_read_lock()
>>>>>>    implicitly disables preemption. preemptible() evaluates to false, and
>>>>>>    the I2C core correctly routes to the atomic, polling transfer path.
>>>>>> 2. Under CONFIG_PREEMPT_RCU, rcu_read_lock() does NOT disable preemption.
>>>>>>    Since machine_restart() left local interrupts enabled, irqs_disabled()
>>>>>>    is false, and preempt_count is 0. Consequently, preemptible() evaluates
>>>>>>    to true.
>>>>>>
>>>>>> As a result, the I2C core falsely assumes a sleepable context and routes
>>>>>> the transfer to the standard master_xfer path. This inevitably triggers a
>>>>>> schedule() call while holding the RCU read lock, resulting in a fatal splat:
>>>>>> "Voluntary context switch within RCU read-side critical section!" and
>>>>>> a system hang.
>>>>>>
>>>>>> Align RISC-V with other major architectures (e.g., ARM64) by adding
>>>>>> local_irq_disable() and smp_send_stop() to machine_restart().
>>>>>> - local_irq_disable() guarantees a strict atomic context, forcing sub-
>>>>>>   systems like I2C to always fall back to polling mode.
>>>>>> - smp_send_stop() ensures exclusive hardware access by quiescing other
>>>>>>   CPUs, preventing them from holding bus locks (e.g., I2C spinlocks)
>>>>>>   during the final restart phase.
>>>>>
>>>>> Maybe while we're at it, we can fix the other functions in this file as
>>>>> well?
>>>> Nice catch. I'll fix other functions in the next version.
>>>>>
>>>>> I think the reason we ended up with the "unsafe" implementations of the
>>>>> reboot/shutdown functions is that on the backend it is usually SBI SRST
>>>>> calls, which can protect itself from other CPUs and interrupts. Since on
>>>>> K1 we're going to be poking I2C directly, we run into the problem
>>>>> described above. So all of these should disable interrupts and stop
>>>>> other CPUs before calling the handlers, and can't assume the handlers
>>>>> are all SBI SRST.
>>>> Yes, we cannot assume that all platforms rely on this.
>>>
>>> Why isn't K1 using the SBI SRST extension? Resetting the platform from S-mode
>>> directly causes problems if you ever want to run another supervisor domain (for
>>> example a TEE or EFI runtime services), which may need to clean up before a
>>> system reset.
>>>
>>> As you mention, other platforms use the standard SBI SRST interface, event if
>>> they need to poke a PMIC to perform a system reset. Is there something
>>> preventing K1 from following this path?
>> 
>> Ideally, yes, resetting the platform should be handled by the firmware (SBI SRST) to
>> properly tear down M-mode/TEE states before pulling the plug.
>> 
>> However, for the K1 platform, while the SoC itself can be reset via SBI SRST, this warm
>> reset does not propagate to the external PMIC. For instance, if the kernel switches the
>> SD card to 1.8V signaling for high-speed mode during runtime, an SoC-only reset leaves
>> the PMIC still supplying 1.8V. Upon reboot, the bootrom/early kernel expects the SD card
>> to be in the default 3.3V state for initialization, which inevitably leads to a boot hang.
>
> Right, so you would need a driver in M-mode that performs a PMIC-level reset
> instead of a SoC-level reset when the SBI SRST function is called. Several
> platforms already do this.
Agreed. Porting the PMIC/I2C driver to M-mode (OpenSBI) is indeed the proper
long-term solution for the K1 platform.
>
>> Setting the K1-specific hardware design aside, I believe this patch is fundamentally necessary
>> to fix a latent bug in the RISC-V Linux kernel itself, regardless of whether the final
>> reset is backed by SBI or not.
>> 
>> The core issue here is the context in which do_kernel_restart() invokes the restart_handler_list.
>> It does so via an atomic_notifier_call_chain. If RISC-V enters this phase without disabling local IRQs,
>> we are leaving a trap for any generic driver or kernel subsystem invoked during this teardown phase.
>> 
>> Under CONFIG_PREEMPT_RCU, the RCU read lock does not bump the preempt count.
>> Without local_irq_disable(), the preemptible() check will unexpectedly evaluate to true.
>> 
>> A quick grep shows that preemptible() is widely relied upon by generic subsystems to
>> determine safe execution paths (~16 occurrences in kernel/ and ~11 in drivers/).
>> This means generic code implicitly trusts the architecture to set up the correct
>> atomic context during a system restart.
>
> I agree with the conceptual issue, that do_kernel_restart() should be called
> from an atomic context, but may not be. But there seems to be no practical
> problem if the first/only entry in restart_handler_list is sbi_srst_reboot(),
> which is why we haven't seen problems on other platforms.
Yes, that perfectly explains why this missing atomic context has remained
hidden on most other platforms.

>
>> The I2C PMIC crash we encountered is just one symptom of this missing context. If we don't
>> fix this at the architecture level, it leaves the door open for other undiscovered panics.
>> Any driver (watchdog, display, network, etc.) that registers a restart handler and internally
>> relies on preemptible() to choose between a sleepable or polling path will inevitably trigger
>> a schedule(), attempt to acquire a sleeping lock (e.g., a mutex), or call other blocking functions
>> while holding the RCU read lock, resulting in a fatal splat.
>
> Are you possibly confusing restart_handler_list and reboot_notifier_list here? A
> display or network driver would not register a restart handler. And similarly to
> the PMIC case, I would expect a platform that resets by watchdog to have a
> M-mode driver for it and still use the SBI SRST extension.
You are completely right, I mixed up `restart_handler_list` and
`reboot_notifier_list`. My apologies for the poor examples. Display and network
drivers definitely use the notifiers. The actual restart handlers are mostly
limited to watchdogs, PMICs, or specific reset controllers.

Thank you for pointing this out.
>
>> *Aligning machine_restart() with other architectures* by adding local_irq_disable() and smp_send_stop()
>> ensures a deterministic, single-threaded atomic context.
>> This protects the Linux teardown sequence from SMP deadlocks and context misidentification,
>> well before the control is ever handed over to the firmware or hardware.
>> 
>> Does this perspective make sense?
>
> Yes, I'm not opposed to making this change, because it is conceptually the right
> thing to do. I'm only questioning why you are in a position to notice this issue
> in the first place: you only see this because Linux is driving the PMIC directly
> instead of going through firmware.
Thanks for the ACK and the insightful review.

I will prepare and send out a v2 shortly.
Looking forward to your further review.

                    - Troy

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-03-17  7:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-11  2:51 [PATCH RFC] riscv: disable local interrupts and stop other CPUs before restart Troy Mitchell
2026-03-11  6:47 ` Aurelien Jarno
2026-03-11  9:49   ` Troy Mitchell
2026-03-11  9:54     ` Troy Mitchell
2026-03-12  3:05 ` Vivian Wang
2026-03-16  7:23   ` Troy Mitchell
2026-03-16 13:19     ` Samuel Holland
2026-03-17  2:45       ` Troy Mitchell
2026-03-17  4:07         ` Samuel Holland
2026-03-17  7:42           ` Troy Mitchell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox