* [PATCH] genirq/chip: Don't call add_interrupt_randomness() for NMIs
@ 2026-05-07 11:05 Mark Rutland
From: Mark Rutland @ 2026-05-07 11:05 UTC
  To: linux-kernel, Thomas Gleixner
  Cc: ada.coupriediaz, justin.he, mark.rutland, maz, mhklinux,
	vladimir.murzin

Recently handle_percpu_devid_irq() was changed to call
add_interrupt_randomness(). This introduced a potential deadlock when
handle_percpu_devid_irq() is used to handle an NMI, which can be
detected with lockdep, e.g.

    ================================
    WARNING: inconsistent lock state
    7.1.0-rc2-pnmi #465 Not tainted
    --------------------------------
    inconsistent {INITIAL USE} -> {IN-NMI} usage.
    perf/695 [HC1[1]:SC0[0]:HE0:SE1] takes:
    ffff00837dfd3a18 (&base->lock){-.-.}-{2:2}, at: lock_timer_base+0x6c/0xac
    {INITIAL USE} state was registered at:
      lock_acquire+0x260/0x40c
      _raw_spin_lock_irqsave+0x68/0xb0
      lock_timer_base+0x6c/0xac
      __mod_timer+0x100/0x32c
      add_timer_global+0x2c/0x40
      __queue_delayed_work+0xf0/0x140
      queue_delayed_work_on+0x134/0x138
      mem_cgroup_css_online+0x30c/0x310
      online_css+0x34/0x10c
      cgroup_init_subsys+0x158/0x1c8
      cgroup_init+0x440/0x524
      start_kernel+0x888/0x998
      __primary_switched+0x88/0x90
    irq event stamp: 62068
    hardirqs last  enabled at (62067): [<ffff8000801cc0ec>] seqcount_lockdep_reader_access.constprop.0+0xf0/0xfc
    hardirqs last disabled at (62068): [<ffff80008150e0ac>] _raw_spin_lock_irqsave+0x94/0xb0
    softirqs last  enabled at (62050): [<ffff800080017614>] put_cpu_fpsimd_context+0x1c/0x4c
    softirqs last disabled at (62048): [<ffff8000800175c8>] get_cpu_fpsimd_context+0x1c/0x4c
        other info that might help us debug this:
    Possible unsafe locking scenario:
           CPU0
           ----
      lock(&base->lock);
      <Interrupt>
        lock(&base->lock);
        *** DEADLOCK ***
    3 locks held by perf/695:
     #0: ffff0080146cd2c8 (&type->i_mutex_dir_key#6){++++}-{4:4}, at: lookup_slow+0x30/0x68
     #1: ffff80008332b858 (rcu_read_lock){....}-{1:3}, at: blk_mq_run_hw_queue+0xf4/0x1fc
     #2: ffff008000b6aa18 (&host->lock){-.-.}-{3:3}, at: ata_scsi_queuecmd+0x28/0x88
    stack backtrace:
    CPU: 3 UID: 0 PID: 695 Comm: perf Not tainted 7.1.0-rc2-pnmi #465 PREEMPT
    Hardware name: ARM LTD Morello System Development Platform, BIOS EDK II Mar  7 2024
    Call trace:
     show_stack+0x18/0x24 (C)
     dump_stack_lvl+0xc4/0x148
     dump_stack+0x18/0x24
     print_usage_bug.part.0+0x29c/0x364
     lock_acquire+0x364/0x40c
     _raw_spin_lock_irqsave+0x68/0xb0
     lock_timer_base+0x6c/0xac
     add_timer_on+0x78/0x16c
     add_interrupt_randomness+0x124/0x134
     handle_percpu_devid_irq+0xd4/0x16c
     handle_irq_desc+0x40/0x58
     generic_handle_domain_nmi+0x28/0x50
     __gic_handle_nmi.isra.0+0x4c/0xa0
     gic_handle_irq+0x38/0x2bc
     call_on_irq_stack+0x30/0x48
     do_interrupt_handler+0x80/0x98
     el1_interrupt+0x90/0xac
     el1h_64_irq_handler+0x18/0x24
     el1h_64_irq+0x80/0x84
     [...]

During review, Thomas pointed out it wouldn't be safe for
handle_percpu_devid_irq() to call add_interrupt_randomness() if it was
used to handle NMIs:

  https://lore.kernel.org/lkml/87bjgik042.ffs@tglx/

... but evidently people missed that handle_percpu_devid_irq() *is* used
for NMIs.

While it might seem that we should handle NMIs with a separate
handle_percpu_devid_nmi() function, for various structural reasons this
was impractical, and handle_percpu_devid_irq() has been expected to be
used for NMIs since commits:

  21bbbc50f398f ("irqchip/gic-v3: Switch high priority PPIs over to handle_percpu_devid_irq()")
  5ff78c8de9d83 ("genirq: Kill handle_percpu_devid_fasteoi_nmi()")

Taking the above into account, avoid the deadlock by not calling
add_interrupt_randomness() when handle_percpu_devid_irq() is called in
an NMI context. This is consistent with our other NMI handling flows,
which do not call add_interrupt_randomness().

At the same time, update the kerneldoc comment to make it clear that
handle_percpu_devid_irq() can be called in NMI context. The rest of
handle_percpu_devid_irq() is currently NMI safe and doesn't need to
change.

Fixes: fd7400cfcbaa ("genirq/chip: Invoke add_interrupt_randomness() in handle_percpu_devid_irq()")
Reported-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Justin He <justin.he@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
---
 kernel/irq/chip.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6c9b1dc4e7d46..b635e3c5d5b6b 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -14,6 +14,7 @@
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
 #include <linux/irqdomain.h>
+#include <linux/preempt.h>
 #include <linux/random.h>
 
 #include <trace/events/irq.h>
@@ -893,7 +894,10 @@ void handle_percpu_irq(struct irq_desc *desc)
  *
  * action->percpu_dev_id is a pointer to percpu variables which
  * contain the real device id for the cpu on which this handler is
- * called
+ * called.
+ *
+ * May be used for NMI interrupt lines, and so may be called in IRQ or NMI
+ * context.
  */
 void handle_percpu_devid_irq(struct irq_desc *desc)
 {
@@ -930,7 +934,8 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
 			    enabled ? " and unmasked" : "", irq, cpu);
 	}
 
-	add_interrupt_randomness(irq);
+	if (!in_nmi())
+		add_interrupt_randomness(irq);
 
 	if (chip->irq_eoi)
 		chip->irq_eoi(&desc->irq_data);
-- 
2.30.2



* Re: [PATCH] genirq/chip: Don't call add_interrupt_randomness() for NMIs
@ 2026-05-07 11:38 ` Jinjie Ruan
From: Jinjie Ruan @ 2026-05-07 11:38 UTC
  To: Mark Rutland, linux-kernel, Thomas Gleixner
  Cc: ada.coupriediaz, justin.he, maz, mhklinux, vladimir.murzin



On 5/7/2026 7:05 PM, Mark Rutland wrote:
> [...]
> @@ -930,7 +934,8 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
>  			    enabled ? " and unmasked" : "", irq, cpu);
>  	}
>  
> -	add_interrupt_randomness(irq);
> +	if (!in_nmi())
> +		add_interrupt_randomness(irq);

Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>

>  
>  	if (chip->irq_eoi)
>  		chip->irq_eoi(&desc->irq_data);



* Re: [PATCH] genirq/chip: Don't call add_interrupt_randomness() for NMIs
@ 2026-05-07 11:48 ` Marc Zyngier
From: Marc Zyngier @ 2026-05-07 11:48 UTC
  To: Mark Rutland
  Cc: linux-kernel, Thomas Gleixner, ada.coupriediaz, justin.he,
	mhklinux, vladimir.murzin

On Thu, 07 May 2026 12:05:18 +0100,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> Recently handle_percpu_devid_irq() was changed to call
> add_interrupt_randomness(). This introduced a potential deadlock when
> handle_percpu_devid_irq() is used to handle an NMI, which can be
> detected with lockdep, e.g.
> 
> [...]

Thanks for catching that one. FWIW:

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.

-- 
Without deviation from the norm, progress is not possible.


* [tip: irq/urgent] genirq/chip: Don't call add_interrupt_randomness() for NMIs
@ 2026-05-11 13:00 ` tip-bot2 for Mark Rutland
From: tip-bot2 for Mark Rutland @ 2026-05-11 13:00 UTC
  To: linux-tip-commits
  Cc: Ada Couprie Diaz, Mark Rutland, Thomas Gleixner, Jinjie Ruan,
	Marc Zyngier, x86, linux-kernel

The following commit has been merged into the irq/urgent branch of tip:

Commit-ID:     512718bbc51b851140380b7068ec7365bd039cba
Gitweb:        https://git.kernel.org/tip/512718bbc51b851140380b7068ec7365bd039cba
Author:        Mark Rutland <mark.rutland@arm.com>
AuthorDate:    Thu, 07 May 2026 12:05:18 +01:00
Committer:     Thomas Gleixner <tglx@kernel.org>
CommitterDate: Mon, 11 May 2026 14:56:04 +02:00

genirq/chip: Don't call add_interrupt_randomness() for NMIs

Recently handle_percpu_devid_irq() was changed to call
add_interrupt_randomness(). This introduced a potential deadlock when
handle_percpu_devid_irq() is used to handle an NMI, which can be
detected with lockdep, e.g.

    ================================
    WARNING: inconsistent lock state
    7.1.0-rc2-pnmi #465 Not tainted
    --------------------------------
    inconsistent {INITIAL USE} -> {IN-NMI} usage.
    perf/695 [HC1[1]:SC0[0]:HE0:SE1] takes:
    ffff00837dfd3a18 (&base->lock){-.-.}-{2:2}, at: lock_timer_base+0x6c/0xac
    {INITIAL USE} state was registered at:
      _raw_spin_lock_irqsave+0x68/0xb0
      lock_timer_base+0x6c/0xac
      __mod_timer+0x100/0x32c
      add_timer_global+0x2c/0x40
      __queue_delayed_work+0xf0/0x140
      queue_delayed_work_on+0x134/0x138
      mem_cgroup_css_online+0x30c/0x310
      online_css+0x34/0x10c
      cgroup_init_subsys+0x158/0x1c8
      cgroup_init+0x440/0x524
      start_kernel+0x888/0x998

    other info that might help us debug this:
    Possible unsafe locking scenario:
           CPU0
           ----
      lock(&base->lock);
      <Interrupt>
        lock(&base->lock);
        *** DEADLOCK ***

    Call trace:
     _raw_spin_lock_irqsave+0x68/0xb0
     lock_timer_base+0x6c/0xac
     add_timer_on+0x78/0x16c
     add_interrupt_randomness+0x124/0x134
     handle_percpu_devid_irq+0xd4/0x16c
     handle_irq_desc+0x40/0x58
     generic_handle_domain_nmi+0x28/0x50
     __gic_handle_nmi.isra.0+0x4c/0xa0
     gic_handle_irq+0x38/0x2bc
     call_on_irq_stack+0x30/0x48
     do_interrupt_handler+0x80/0x98
     el1_interrupt+0x90/0xac
     el1h_64_irq_handler+0x18/0x24
     el1h_64_irq+0x80/0x84
     [...]

During review, Thomas pointed out it wouldn't be safe for
handle_percpu_devid_irq() to call add_interrupt_randomness() if it was
used to handle NMIs:

  https://lore.kernel.org/lkml/87bjgik042.ffs@tglx/

... but evidently people missed that handle_percpu_devid_irq() *is* used
for NMIs.

While it might seem that NMIs should be handled with a separate
handle_percpu_devid_nmi() function, for various structural reasons this was
impractical, and handle_percpu_devid_irq() has been expected to be used for
NMIs since commits:

  21bbbc50f398f ("irqchip/gic-v3: Switch high priority PPIs over to handle_percpu_devid_irq()")
  5ff78c8de9d83 ("genirq: Kill handle_percpu_devid_fasteoi_nmi()")

Taking the above into account, avoid the deadlock by not calling
add_interrupt_randomness() when handle_percpu_devid_irq() is called in an
NMI context. This is consistent with other NMI handling flows, which do not
call add_interrupt_randomness().

At the same time, update the kernel-doc comment to make it clear that
handle_percpu_devid_irq() can be called in NMI context. The rest of
handle_percpu_devid_irq() is currently NMI safe and doesn't need to change.

Fixes: fd7400cfcbaa ("genirq/chip: Invoke add_interrupt_randomness() in handle_percpu_devid_irq()")
Reported-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/20260507110518.3128248-1-mark.rutland@arm.com
---
 kernel/irq/chip.c |  9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 6c9b1dc..b635e3c 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -14,6 +14,7 @@
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
 #include <linux/irqdomain.h>
+#include <linux/preempt.h>
 #include <linux/random.h>
 
 #include <trace/events/irq.h>
@@ -893,7 +894,10 @@ void handle_percpu_irq(struct irq_desc *desc)
  *
  * action->percpu_dev_id is a pointer to percpu variables which
  * contain the real device id for the cpu on which this handler is
- * called
+ * called.
+ *
+ * May be used for NMI interrupt lines, and so may be called in IRQ or NMI
+ * context.
  */
 void handle_percpu_devid_irq(struct irq_desc *desc)
 {
@@ -930,7 +934,8 @@ void handle_percpu_devid_irq(struct irq_desc *desc)
 			    enabled ? " and unmasked" : "", irq, cpu);
 	}
 
-	add_interrupt_randomness(irq);
+	if (!in_nmi())
+		add_interrupt_randomness(irq);
 
 	if (chip->irq_eoi)
 		chip->irq_eoi(&desc->irq_data);
