linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] locking: detect spin_lock_irq() call with disabled interrupts
@ 2025-06-06  9:57 Pavel Tikhomirov
  2025-06-06 10:58 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Pavel Tikhomirov @ 2025-06-06  9:57 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long,
	Kees Cook, Joel Granados, Andrew Morton
  Cc: Konstantin Khorenko, Denis Lunev, Aleksandr Mikhalitsyn,
	Pavel Tikhomirov, linux-kernel, linux-fsdevel, kernel

This is intended to easily detect irq spinlock self-deadlocks like:

  spin_lock_irq(A);
  spin_lock_irq(B);
  spin_unlock_irq(B);
    IRQ {
      spin_lock(A); <- deadlocks
      spin_unlock(A);
    }
  spin_unlock_irq(A);

Recently we saw this kind of deadlock on our partner's node:

PID: 408      TASK: ffff8eee0870ca00  CPU: 36   COMMAND: "kworker/36:1H"
 #0 [fffffe3861831e60] crash_nmi_callback at ffffffff97269e31
 #1 [fffffe3861831e68] nmi_handle at ffffffff972300bb
 #2 [fffffe3861831eb0] default_do_nmi at ffffffff97e9e000
 #3 [fffffe3861831ed0] exc_nmi at ffffffff97e9e211
 #4 [fffffe3861831ef0] end_repeat_nmi at ffffffff98001639
    [exception RIP: native_queued_spin_lock_slowpath+638]
    RIP: ffffffff97eb31ae  RSP: ffffb1c8cd2a4d40  RFLAGS: 00000046
    RAX: 0000000000000000  RBX: ffff8f2dffb34780  RCX: 0000000000940000
    RDX: 000000000000002a  RSI: 0000000000ac0000  RDI: ffff8eaed4eb81c0
    RBP: ffff8eaed4eb81c0   R8: 0000000000000000   R9: ffff8f2dffaf3438
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000024  R14: 0000000000000000  R15: ffffd1c8bfb24b80
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #5 [ffffb1c8cd2a4d40] native_queued_spin_lock_slowpath at ffffffff97eb31ae
 #6 [ffffb1c8cd2a4d60] _raw_spin_lock_irqsave at ffffffff97eb2730
 #7 [ffffb1c8cd2a4d70] __wake_up at ffffffff9737c02d
 #8 [ffffb1c8cd2a4da0] sbitmap_queue_wake_up at ffffffff9786c74d
 #9 [ffffb1c8cd2a4dc8] sbitmap_queue_clear at ffffffff9786cc97
--- <IRQ stack> ---
    [exception RIP: _raw_spin_unlock_irq+20]
    RIP: ffffffff97eb2e84  RSP: ffffb1c8cd90fd18  RFLAGS: 00000283
    RAX: 0000000000000001  RBX: ffff8eafb68efb40  RCX: 0000000000000001
    RDX: 0000000000000008  RSI: 0000000000000061  RDI: ffff8eafb06c3c70
    RBP: ffff8eee7af43000   R8: ffff8eaed4eb81c8   R9: ffff8eaed4eb81c8
    R10: 0000000000000008  R11: 0000000000000008  R12: 0000000000000000
    R13: ffff8eafb06c3bd0  R14: ffff8eafb06c3bc0  R15: ffff8eaed4eb81c0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Luckily it was already fixed in mainstream by:
commit b313a8c83551 ("block: Fix lockdep warning in blk_mq_mark_tag_wait")

Currently if we are unlucky we may miss such a deadlock on our testing
system as it is racy and it depends on the specific interrupt handler
appearing at the right place and at the right time. So this patch tries
to detect the problem despite the absence of the interrupt.

If we see spin_lock_irq under interrupts already disabled we can assume
that it has paired spin_unlock_irq which would reenable interrupts where
they should not be reenabled. So we report a warning for it.

Same thing on spin_unlock_irq even if we were lucky and there was no
deadlock let's report if interrupts were enabled.

Let's make this functionality catch one problem and then be disabled, to
prevent from spamming kernel log with warnings. Also let's add sysctl
kernel.debug_spin_lock_irq_with_disabled_interrupts to reenable it if
needed. Also let's add a by default enabled configuration option
DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT, in case we will
need this on boot.

Yes Lockdep can detect that, if it sees both the interrupt stack and the
regular stack where we can get into interrupt with spinlock held. But
with this approach we can detect the problem even without ever getting
into interrupt stack. And also this functionality seems to be more
lightweight then Lockdep as it does not need to maintain lock dependency
graph.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

--
Tested with https://github.com/Snorch/spinlock-irq-test-module.
---
 include/linux/spinlock.h  | 21 +++++++++++++++++++++
 kernel/locking/spinlock.c |  6 ++++++
 kernel/sysctl.c           |  9 +++++++++
 lib/Kconfig.debug         | 12 ++++++++++++
 4 files changed, 48 insertions(+)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index d3561c4a080e..b8ebccaa5062 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -371,8 +371,21 @@ do {									\
 	raw_spin_lock_nest_lock(spinlock_check(lock), nest_lock);	\
 } while (0)
 
+#ifdef CONFIG_DEBUG_SPINLOCK
+DECLARE_STATIC_KEY_MAYBE(CONFIG_DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT,
+			 debug_spin_lock_irq_with_disabled_interrupts);
+#endif
+
 static __always_inline void spin_lock_irq(spinlock_t *lock)
 {
+#ifdef CONFIG_DEBUG_SPINLOCK
+	if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
+		if (raw_irqs_disabled()) {
+			static_branch_disable(&debug_spin_lock_irq_with_disabled_interrupts);
+			WARN(1, "spin_lock_irq() called with irqs disabled!\n");
+		}
+	}
+#endif
 	raw_spin_lock_irq(&lock->rlock);
 }
 
@@ -398,6 +411,14 @@ static __always_inline void spin_unlock_bh(spinlock_t *lock)
 
 static __always_inline void spin_unlock_irq(spinlock_t *lock)
 {
+#ifdef CONFIG_DEBUG_SPINLOCK
+	if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
+		if (!raw_irqs_disabled()) {
+			static_branch_disable(&debug_spin_lock_irq_with_disabled_interrupts);
+			WARN(1, "spin_unlock_irq() called with irqs enabled!\n");
+		}
+	}
+#endif
 	raw_spin_unlock_irq(&lock->rlock);
 }
 
diff --git a/kernel/locking/spinlock.c b/kernel/locking/spinlock.c
index 7685defd7c52..6ec4a788f53c 100644
--- a/kernel/locking/spinlock.c
+++ b/kernel/locking/spinlock.c
@@ -22,6 +22,12 @@
 #include <linux/debug_locks.h>
 #include <linux/export.h>
 
+#ifdef CONFIG_DEBUG_SPINLOCK
+DEFINE_STATIC_KEY_MAYBE(CONFIG_DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT,
+			debug_spin_lock_irq_with_disabled_interrupts);
+EXPORT_SYMBOL(debug_spin_lock_irq_with_disabled_interrupts);
+#endif
+
 #ifdef CONFIG_MMIOWB
 #ifndef arch_mmiowb_state
 DEFINE_PER_CPU(struct mmiowb_state, __mmiowb_state);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 9b4f0cff76ea..1e3cca2e3c8f 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -50,6 +50,7 @@
 #include <linux/sched/sysctl.h>
 #include <linux/mount.h>
 #include <linux/pid.h>
+#include <linux/spinlock.h>
 
 #include "../lib/kstrtox.h"
 
@@ -1758,6 +1759,14 @@ static const struct ctl_table kern_table[] = {
 		.extra2		= SYSCTL_INT_MAX,
 	},
 #endif
+#ifdef CONFIG_DEBUG_SPINLOCK
+	{
+		.procname	= "debug_spin_lock_irq_with_disabled_interrupts",
+		.data		= &debug_spin_lock_irq_with_disabled_interrupts,
+		.mode		= 0644,
+		.proc_handler	= proc_do_static_key,
+	},
+#endif
 };
 
 int __init sysctl_init_bases(void)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ebe33181b6e6..c4834f4c9d51 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1465,6 +1465,18 @@ config DEBUG_SPINLOCK
 	  best used in conjunction with the NMI watchdog so that spinlock
 	  deadlocks are also debuggable.
 
+config DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT
+	bool "Detect spin_(un)lock_irq() call with disabled(enabled) interrupts"
+	depends on DEBUG_SPINLOCK
+	help
+	  Say Y here to detect spin_lock_irq() and spin_unlock_irq() calls
+	  with disabled (enabled) interrupts. This helps detecting bugs
+	  where the code is not using the right locking primitives. E.g.
+	  using spin_lock_irq() twice in a row (on different locks). And thus
+	  code can reenable interrupts where they should be disabled and lead
+	  to deadlock.
+	  Say N if you are unsure.
+
 config DEBUG_MUTEXES
 	bool "Mutex debugging: basic checks"
 	depends on DEBUG_KERNEL && !PREEMPT_RT
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] locking: detect spin_lock_irq() call with disabled interrupts
  2025-06-06  9:57 [PATCH] locking: detect spin_lock_irq() call with disabled interrupts Pavel Tikhomirov
@ 2025-06-06 10:58 ` Peter Zijlstra
  2025-06-09  4:25   ` Pavel Tikhomirov
  2025-06-06 16:09 ` kernel test robot
  2025-06-06 21:09 ` kernel test robot
  2 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2025-06-06 10:58 UTC (permalink / raw)
  To: Pavel Tikhomirov
  Cc: Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long, Kees Cook,
	Joel Granados, Andrew Morton, Konstantin Khorenko, Denis Lunev,
	Aleksandr Mikhalitsyn, linux-kernel, linux-fsdevel, kernel

On Fri, Jun 06, 2025 at 05:57:23PM +0800, Pavel Tikhomirov wrote:
> This is intended to easily detect irq spinlock self-deadlocks like:
> 
>   spin_lock_irq(A);
>   spin_lock_irq(B);
>   spin_unlock_irq(B);
>     IRQ {
>       spin_lock(A); <- deadlocks
>       spin_unlock(A);
>     }
>   spin_unlock_irq(A);
> 
> Recently we saw this kind of deadlock on our partner's node:
> 
> PID: 408      TASK: ffff8eee0870ca00  CPU: 36   COMMAND: "kworker/36:1H"
>  #0 [fffffe3861831e60] crash_nmi_callback at ffffffff97269e31
>  #1 [fffffe3861831e68] nmi_handle at ffffffff972300bb
>  #2 [fffffe3861831eb0] default_do_nmi at ffffffff97e9e000
>  #3 [fffffe3861831ed0] exc_nmi at ffffffff97e9e211
>  #4 [fffffe3861831ef0] end_repeat_nmi at ffffffff98001639
>     [exception RIP: native_queued_spin_lock_slowpath+638]
>     RIP: ffffffff97eb31ae  RSP: ffffb1c8cd2a4d40  RFLAGS: 00000046
>     RAX: 0000000000000000  RBX: ffff8f2dffb34780  RCX: 0000000000940000
>     RDX: 000000000000002a  RSI: 0000000000ac0000  RDI: ffff8eaed4eb81c0
>     RBP: ffff8eaed4eb81c0   R8: 0000000000000000   R9: ffff8f2dffaf3438
>     R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
>     R13: 0000000000000024  R14: 0000000000000000  R15: ffffd1c8bfb24b80
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> --- <NMI exception stack> ---
>  #5 [ffffb1c8cd2a4d40] native_queued_spin_lock_slowpath at ffffffff97eb31ae
>  #6 [ffffb1c8cd2a4d60] _raw_spin_lock_irqsave at ffffffff97eb2730
>  #7 [ffffb1c8cd2a4d70] __wake_up at ffffffff9737c02d
>  #8 [ffffb1c8cd2a4da0] sbitmap_queue_wake_up at ffffffff9786c74d
>  #9 [ffffb1c8cd2a4dc8] sbitmap_queue_clear at ffffffff9786cc97
> --- <IRQ stack> ---
>     [exception RIP: _raw_spin_unlock_irq+20]
>     RIP: ffffffff97eb2e84  RSP: ffffb1c8cd90fd18  RFLAGS: 00000283
>     RAX: 0000000000000001  RBX: ffff8eafb68efb40  RCX: 0000000000000001
>     RDX: 0000000000000008  RSI: 0000000000000061  RDI: ffff8eafb06c3c70
>     RBP: ffff8eee7af43000   R8: ffff8eaed4eb81c8   R9: ffff8eaed4eb81c8
>     R10: 0000000000000008  R11: 0000000000000008  R12: 0000000000000000
>     R13: ffff8eafb06c3bd0  R14: ffff8eafb06c3bc0  R15: ffff8eaed4eb81c0
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> 
> Luckily it was already fixed in mainstream by:
> commit b313a8c83551 ("block: Fix lockdep warning in blk_mq_mark_tag_wait")
> 
> Currently if we are unlucky we may miss such a deadlock on our testing
> system as it is racy and it depends on the specific interrupt handler
> appearing at the right place and at the right time. So this patch tries
> to detect the problem despite the absence of the interrupt.
> 
> If we see spin_lock_irq under interrupts already disabled we can assume
> that it has paired spin_unlock_irq which would reenable interrupts where
> they should not be reenabled. So we report a warning for it.
> 
> Same thing on spin_unlock_irq even if we were lucky and there was no
> deadlock let's report if interrupts were enabled.
> 
> Let's make this functionality catch one problem and then be disabled, to
> prevent from spamming kernel log with warnings. Also let's add sysctl
> kernel.debug_spin_lock_irq_with_disabled_interrupts to reenable it if
> needed. Also let's add a by default enabled configuration option
> DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT, in case we will
> need this on boot.
> 
> Yes Lockdep can detect that, if it sees both the interrupt stack and the
> regular stack where we can get into interrupt with spinlock held. But
> with this approach we can detect the problem even without ever getting
> into interrupt stack. And also this functionality seems to be more
> lightweight then Lockdep as it does not need to maintain lock dependency
> graph.

So why do we need DEBUG_SPINLOCK code, that's injected into every single
callsite, if lockdep can already detect this?



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] locking: detect spin_lock_irq() call with disabled interrupts
  2025-06-06  9:57 [PATCH] locking: detect spin_lock_irq() call with disabled interrupts Pavel Tikhomirov
  2025-06-06 10:58 ` Peter Zijlstra
@ 2025-06-06 16:09 ` kernel test robot
  2025-06-06 21:09 ` kernel test robot
  2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2025-06-06 16:09 UTC (permalink / raw)
  To: Pavel Tikhomirov, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Boqun Feng, Waiman Long, Kees Cook, Joel Granados, Andrew Morton
  Cc: oe-kbuild-all, Linux Memory Management List, Konstantin Khorenko,
	Denis Lunev, Aleksandr Mikhalitsyn, Pavel Tikhomirov,
	linux-kernel, linux-fsdevel, kernel

Hi Pavel,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/locking/core]
[also build test ERROR on sysctl/sysctl-next akpm-mm/mm-nonmm-unstable tip/master linus/master v6.15 next-20250606]
[cannot apply to mcgrof/sysctl-next tip/auto-latest]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pavel-Tikhomirov/locking-detect-spin_lock_irq-call-with-disabled-interrupts/20250606-175911
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20250606095741.46775-1-ptikhomirov%40virtuozzo.com
patch subject: [PATCH] locking: detect spin_lock_irq() call with disabled interrupts
config: riscv-randconfig-002-20250606 (https://download.01.org/0day-ci/archive/20250606/202506062318.7g54PAh3-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250606/202506062318.7g54PAh3-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506062318.7g54PAh3-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/mmzone.h:8,
                    from include/linux/gfp.h:7,
                    from include/linux/mm.h:7,
                    from arch/riscv/kernel/asm-offsets.c:8:
>> include/linux/spinlock.h:375:1: warning: data definition has no type or storage class
    DECLARE_STATIC_KEY_MAYBE(CONFIG_DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT,
    ^~~~~~~~~~~~~~~~~~~~~~~~
>> include/linux/spinlock.h:375:1: error: type defaults to 'int' in declaration of 'DECLARE_STATIC_KEY_MAYBE' [-Werror=implicit-int]
>> include/linux/spinlock.h:376:5: warning: parameter names (without types) in function declaration
        debug_spin_lock_irq_with_disabled_interrupts);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/spinlock.h: In function 'spin_lock_irq':
>> include/linux/spinlock.h:382:6: error: implicit declaration of function 'static_branch_unlikely' [-Werror=implicit-function-declaration]
     if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
         ^~~~~~~~~~~~~~~~~~~~~~
>> include/linux/spinlock.h:382:30: error: 'debug_spin_lock_irq_with_disabled_interrupts' undeclared (first use in this function)
     if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/spinlock.h:382:30: note: each undeclared identifier is reported only once for each function it appears in
>> include/linux/spinlock.h:384:4: error: implicit declaration of function 'static_branch_disable'; did you mean 'stack_trace_save'? [-Werror=implicit-function-declaration]
       static_branch_disable(&debug_spin_lock_irq_with_disabled_interrupts);
       ^~~~~~~~~~~~~~~~~~~~~
       stack_trace_save
   include/linux/spinlock.h: In function 'spin_unlock_irq':
   include/linux/spinlock.h:415:30: error: 'debug_spin_lock_irq_with_disabled_interrupts' undeclared (first use in this function)
     if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
   make[3]: *** [scripts/Makefile.build:98: arch/riscv/kernel/asm-offsets.s] Error 1 shuffle=2202685202
   make[3]: Target 'prepare' not remade because of errors.
   make[2]: *** [Makefile:1275: prepare0] Error 2 shuffle=2202685202
   make[2]: Target 'prepare' not remade because of errors.
   make[1]: *** [Makefile:248: __sub-make] Error 2 shuffle=2202685202
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:248: __sub-make] Error 2 shuffle=2202685202
   make: Target 'prepare' not remade because of errors.


vim +375 include/linux/spinlock.h

   373	
   374	#ifdef CONFIG_DEBUG_SPINLOCK
 > 375	DECLARE_STATIC_KEY_MAYBE(CONFIG_DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT,
 > 376				 debug_spin_lock_irq_with_disabled_interrupts);
   377	#endif
   378	
   379	static __always_inline void spin_lock_irq(spinlock_t *lock)
   380	{
   381	#ifdef CONFIG_DEBUG_SPINLOCK
 > 382		if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
   383			if (raw_irqs_disabled()) {
 > 384				static_branch_disable(&debug_spin_lock_irq_with_disabled_interrupts);
   385				WARN(1, "spin_lock_irq() called with irqs disabled!\n");
   386			}
   387		}
   388	#endif
   389		raw_spin_lock_irq(&lock->rlock);
   390	}
   391	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] locking: detect spin_lock_irq() call with disabled interrupts
  2025-06-06  9:57 [PATCH] locking: detect spin_lock_irq() call with disabled interrupts Pavel Tikhomirov
  2025-06-06 10:58 ` Peter Zijlstra
  2025-06-06 16:09 ` kernel test robot
@ 2025-06-06 21:09 ` kernel test robot
  2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2025-06-06 21:09 UTC (permalink / raw)
  To: Pavel Tikhomirov, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Boqun Feng, Waiman Long, Kees Cook, Joel Granados, Andrew Morton
  Cc: llvm, oe-kbuild-all, Linux Memory Management List,
	Konstantin Khorenko, Denis Lunev, Aleksandr Mikhalitsyn,
	Pavel Tikhomirov, linux-kernel, linux-fsdevel, kernel

Hi Pavel,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tip/locking/core]
[also build test WARNING on sysctl/sysctl-next akpm-mm/mm-nonmm-unstable tip/master linus/master v6.15 next-20250606]
[cannot apply to mcgrof/sysctl-next tip/auto-latest]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pavel-Tikhomirov/locking-detect-spin_lock_irq-call-with-disabled-interrupts/20250606-175911
base:   tip/locking/core
patch link:    https://lore.kernel.org/r/20250606095741.46775-1-ptikhomirov%40virtuozzo.com
patch subject: [PATCH] locking: detect spin_lock_irq() call with disabled interrupts
config: arm-randconfig-003-20250606 (https://download.01.org/0day-ci/archive/20250607/202506070405.MCweX7O0-lkp@intel.com/config)
compiler: clang version 21.0.0git (https://github.com/llvm/llvm-project f819f46284f2a79790038e1f6649172789734ae8)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250607/202506070405.MCweX7O0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506070405.MCweX7O0-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from arch/arm/plat-orion/irq.c:13:
   In file included from include/linux/irq.h:14:
   include/linux/spinlock.h:375:1: error: type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int [-Wimplicit-int]
     375 | DECLARE_STATIC_KEY_MAYBE(CONFIG_DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT,
         | ^
         | int
   include/linux/spinlock.h:375:26: error: a parameter list without types is only allowed in a function definition
     375 | DECLARE_STATIC_KEY_MAYBE(CONFIG_DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT,
         |                          ^
   include/linux/spinlock.h:382:6: error: call to undeclared function 'static_branch_unlikely'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     382 |         if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
         |             ^
   include/linux/spinlock.h:382:30: error: use of undeclared identifier 'debug_spin_lock_irq_with_disabled_interrupts'
     382 |         if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
         |                                     ^
   include/linux/spinlock.h:384:4: error: call to undeclared function 'static_branch_disable'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     384 |                         static_branch_disable(&debug_spin_lock_irq_with_disabled_interrupts);
         |                         ^
   include/linux/spinlock.h:384:27: error: use of undeclared identifier 'debug_spin_lock_irq_with_disabled_interrupts'
     384 |                         static_branch_disable(&debug_spin_lock_irq_with_disabled_interrupts);
         |                                                ^
   include/linux/spinlock.h:415:6: error: call to undeclared function 'static_branch_unlikely'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     415 |         if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
         |             ^
   include/linux/spinlock.h:415:30: error: use of undeclared identifier 'debug_spin_lock_irq_with_disabled_interrupts'
     415 |         if (static_branch_unlikely(&debug_spin_lock_irq_with_disabled_interrupts)) {
         |                                     ^
   include/linux/spinlock.h:417:4: error: call to undeclared function 'static_branch_disable'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     417 |                         static_branch_disable(&debug_spin_lock_irq_with_disabled_interrupts);
         |                         ^
   include/linux/spinlock.h:417:27: error: use of undeclared identifier 'debug_spin_lock_irq_with_disabled_interrupts'
     417 |                         static_branch_disable(&debug_spin_lock_irq_with_disabled_interrupts);
         |                                                ^
>> arch/arm/plat-orion/irq.c:37:29: warning: shift count >= width of type [-Wshift-count-overflow]
      37 |         irq_setup_generic_chip(gc, IRQ_MSK(32), IRQ_GC_INIT_MASK_CACHE,
         |                                    ^~~~~~~~~~~
   include/linux/irq.h:1223:41: note: expanded from macro 'IRQ_MSK'
    1223 | #define IRQ_MSK(n) (u32)((n) < 32 ? ((1 << (n)) - 1) : UINT_MAX)
         |                                         ^  ~~~
   1 warning and 10 errors generated.


vim +37 arch/arm/plat-orion/irq.c

f28d7de6bd4d41 Sebastian Hesselbarth 2014-01-16  21  
01eb569823792a Lennert Buytenhek     2008-03-27  22  void __init orion_irq_init(unsigned int irq_start, void __iomem *maskaddr)
01eb569823792a Lennert Buytenhek     2008-03-27  23  {
e59347a1d15c0b Thomas Gleixner       2011-04-14  24  	struct irq_chip_generic *gc;
e59347a1d15c0b Thomas Gleixner       2011-04-14  25  	struct irq_chip_type *ct;
01eb569823792a Lennert Buytenhek     2008-03-27  26  
01eb569823792a Lennert Buytenhek     2008-03-27  27  	/*
01eb569823792a Lennert Buytenhek     2008-03-27  28  	 * Mask all interrupts initially.
01eb569823792a Lennert Buytenhek     2008-03-27  29  	 */
01eb569823792a Lennert Buytenhek     2008-03-27  30  	writel(0, maskaddr);
01eb569823792a Lennert Buytenhek     2008-03-27  31  
e59347a1d15c0b Thomas Gleixner       2011-04-14  32  	gc = irq_alloc_generic_chip("orion_irq", 1, irq_start, maskaddr,
f38c02f3b33865 Thomas Gleixner       2011-03-24  33  				    handle_level_irq);
e59347a1d15c0b Thomas Gleixner       2011-04-14  34  	ct = gc->chip_types;
e59347a1d15c0b Thomas Gleixner       2011-04-14  35  	ct->chip.irq_mask = irq_gc_mask_clr_bit;
e59347a1d15c0b Thomas Gleixner       2011-04-14  36  	ct->chip.irq_unmask = irq_gc_mask_set_bit;
e59347a1d15c0b Thomas Gleixner       2011-04-14 @37  	irq_setup_generic_chip(gc, IRQ_MSK(32), IRQ_GC_INIT_MASK_CACHE,

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] locking: detect spin_lock_irq() call with disabled interrupts
  2025-06-06 10:58 ` Peter Zijlstra
@ 2025-06-09  4:25   ` Pavel Tikhomirov
  0 siblings, 0 replies; 5+ messages in thread
From: Pavel Tikhomirov @ 2025-06-09  4:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long, Kees Cook,
	Joel Granados, Andrew Morton, Konstantin Khorenko, Denis Lunev,
	Aleksandr Mikhalitsyn, linux-kernel, linux-fsdevel



On 6/6/25 18:58, Peter Zijlstra wrote:
> On Fri, Jun 06, 2025 at 05:57:23PM +0800, Pavel Tikhomirov wrote:
>> This is intended to easily detect irq spinlock self-deadlocks like:
>>
>>    spin_lock_irq(A);
>>    spin_lock_irq(B);
>>    spin_unlock_irq(B);
>>      IRQ {
>>        spin_lock(A); <- deadlocks
>>        spin_unlock(A);
>>      }
>>    spin_unlock_irq(A);
>>
>> Recently we saw this kind of deadlock on our partner's node:
>>
>> PID: 408      TASK: ffff8eee0870ca00  CPU: 36   COMMAND: "kworker/36:1H"
>>   #0 [fffffe3861831e60] crash_nmi_callback at ffffffff97269e31
>>   #1 [fffffe3861831e68] nmi_handle at ffffffff972300bb
>>   #2 [fffffe3861831eb0] default_do_nmi at ffffffff97e9e000
>>   #3 [fffffe3861831ed0] exc_nmi at ffffffff97e9e211
>>   #4 [fffffe3861831ef0] end_repeat_nmi at ffffffff98001639
>>      [exception RIP: native_queued_spin_lock_slowpath+638]
>>      RIP: ffffffff97eb31ae  RSP: ffffb1c8cd2a4d40  RFLAGS: 00000046
>>      RAX: 0000000000000000  RBX: ffff8f2dffb34780  RCX: 0000000000940000
>>      RDX: 000000000000002a  RSI: 0000000000ac0000  RDI: ffff8eaed4eb81c0
>>      RBP: ffff8eaed4eb81c0   R8: 0000000000000000   R9: ffff8f2dffaf3438
>>      R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
>>      R13: 0000000000000024  R14: 0000000000000000  R15: ffffd1c8bfb24b80
>>      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>> --- <NMI exception stack> ---
>>   #5 [ffffb1c8cd2a4d40] native_queued_spin_lock_slowpath at ffffffff97eb31ae
>>   #6 [ffffb1c8cd2a4d60] _raw_spin_lock_irqsave at ffffffff97eb2730
>>   #7 [ffffb1c8cd2a4d70] __wake_up at ffffffff9737c02d
>>   #8 [ffffb1c8cd2a4da0] sbitmap_queue_wake_up at ffffffff9786c74d
>>   #9 [ffffb1c8cd2a4dc8] sbitmap_queue_clear at ffffffff9786cc97
>> --- <IRQ stack> ---
>>      [exception RIP: _raw_spin_unlock_irq+20]
>>      RIP: ffffffff97eb2e84  RSP: ffffb1c8cd90fd18  RFLAGS: 00000283
>>      RAX: 0000000000000001  RBX: ffff8eafb68efb40  RCX: 0000000000000001
>>      RDX: 0000000000000008  RSI: 0000000000000061  RDI: ffff8eafb06c3c70
>>      RBP: ffff8eee7af43000   R8: ffff8eaed4eb81c8   R9: ffff8eaed4eb81c8
>>      R10: 0000000000000008  R11: 0000000000000008  R12: 0000000000000000
>>      R13: ffff8eafb06c3bd0  R14: ffff8eafb06c3bc0  R15: ffff8eaed4eb81c0
>>      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>>
>> Luckily it was already fixed in mainstream by:
>> commit b313a8c83551 ("block: Fix lockdep warning in blk_mq_mark_tag_wait")
>>
>> Currently if we are unlucky we may miss such a deadlock on our testing
>> system as it is racy and it depends on the specific interrupt handler
>> appearing at the right place and at the right time. So this patch tries
>> to detect the problem despite the absence of the interrupt.
>>
>> If we see spin_lock_irq under interrupts already disabled we can assume
>> that it has paired spin_unlock_irq which would reenable interrupts where
>> they should not be reenabled. So we report a warning for it.
>>
>> Same thing on spin_unlock_irq even if we were lucky and there was no
>> deadlock let's report if interrupts were enabled.
>>
>> Let's make this functionality catch one problem and then be disabled, to
>> prevent from spamming kernel log with warnings. Also let's add sysctl
>> kernel.debug_spin_lock_irq_with_disabled_interrupts to reenable it if
>> needed. Also let's add a by default enabled configuration option
>> DEBUG_SPINLOCK_IRQ_WITH_DISABLED_INTERRUPTS_BY_DEFAULT, in case we will
>> need this on boot.
>>
>> Yes Lockdep can detect that, if it sees both the interrupt stack and the
>> regular stack where we can get into interrupt with spinlock held. But
>> with this approach we can detect the problem even without ever getting
>> into interrupt stack. And also this functionality seems to be more
>> lightweight then Lockdep as it does not need to maintain lock dependency
>> graph.
> 
> So why do we need DEBUG_SPINLOCK code, that's injected into every single
> callsite, if lockdep can already detect this?

Hello Peter,

Thank you for your reply!

We are trying to figure out a way to improve the detection of similar 
cases in future, in our tests and even in some cases in production. I 
see that in mainstream this issue (commit b313a8c83551) was detected by 
Lockdep and fixed in the next release, that is very fast. But, sadly, we 
didn't catch it in our QA testing cycles (including runs with debug 
kernel with Lockdep), and it led to repeated complete nodes hang in 
production of one of our partners.

1. This patch, arguably, is less performance consuming than Lockdep (I 
don't have performance results, but my motivations is: we don't need to 
save stacks and manage lock topology tree on each lock, thus saving 
memory and some cpu cycles). Because Lockdep makes tests slow, we don't 
run all tests with it, and hopefully we would be able to run all tests 
with this small check, and it will improve detection probability.

2. If on a production node we have a reproduce of such a deadlock on 
spinlock and we can't easily find the real spinlock owner in crashdump, 
we can probably enable this small feature there and catch the "guilty" 
stack directly on next reproduce. Running Lockdep there will definitely 
be a no go. The suggested checks are really quite small and lightweight 
and we think even release kernels can have those checks compiled-in 
(with static key disabled by default surely) and those checks can be 
enabled without extra reboot on affected nodes and with very small 
performance footprint.

3. Lockdep generates false positives, from my experience roughly <10% of 
what it reports is an actual thing, but maybe I'm just unlucky (or not 
understanding something). And then Lockdep detects a problem there is no 
way to search for the next error (AFAIU lock dependency tree gets a 
cycle), you have to reboot, if I don't miss something. So we have to fix 
even false positives if we want to get possible new detections later on 
the same test. (Or maybe disable Lockdep tracing with lockdep_off, but 
that can skip a real problem nearby.)

4. If lock in interrupt code path will not be triggered in test Lockdep 
will not detect the deadlock. But with this check we only need to 
trigger the non-interrupt code path to detect the problem (having 
enabled interrupt brefore spin_unlock_irq* is always a bad sign) and 
possible deadlock. So we again improve probability of detecting such 
cases by a small fraction.


> 
> 

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-06-09  4:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-06  9:57 [PATCH] locking: detect spin_lock_irq() call with disabled interrupts Pavel Tikhomirov
2025-06-06 10:58 ` Peter Zijlstra
2025-06-09  4:25   ` Pavel Tikhomirov
2025-06-06 16:09 ` kernel test robot
2025-06-06 21:09 ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).