From: "Zqiang" <qiang.zhang@linux.dev>
To: paulmck@kernel.org, "kernel test robot" <oliver.sang@intel.com>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	rcu@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [paulmckrcu:dev.2025.08.21a] [rcu] 8bd9383727: WARNING:possible_circular_locking_dependency_detected
Date: Sat, 30 Aug 2025 02:38:35 +0000
Message-ID: <eb1e5ab00253fdae5ba5aa4c97d60a79d357dbfd@linux.dev>
In-Reply-To: <2853a174-76e4-440b-bfc1-71ea30694822@paulmck-laptop>

> 
> On Tue, Aug 26, 2025 at 04:47:22PM +0800, kernel test robot wrote:
> 
> > 
> > hi, Paul,
> >  
> >  the similar issue still exists on this dev.2025.08.21a branch.
> >  again, if the issue is already fixed on later branches, please just ignore.
> >  thanks
> >  
> >  
> >  Hello,
> >  
> >  kernel test robot noticed "WARNING:possible_circular_locking_dependency_detected" on:
> >  
> >  commit: 8bd9383727068a5a18acfecefbdfa44a7d6bd838 ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast")
> >  https://github.com/paulmckrcu/linux dev.2025.08.21a
> >  
> >  in testcase: rcutorture
> >  version: 
> >  with following parameters:
> >  
> >  runtime: 300s
> >  test: default
> >  torture_type: tasks-tracing
> >  
> >  
> >  
> >  config: x86_64-randconfig-003-20250824
> >  compiler: clang-20
> >  test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >  
> >  (please refer to attached dmesg/kmsg for entire log/backtrace)
> > 
> Again, apologies for being slow, and thank you for your testing efforts.
> 
> Idiot here forgot about Tiny SRCU, so please see the end of this email
> for an alleged fix. Does it do the trick for you?
> 
>  Thanx, Paul
> 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> >  the same patch/commit), kindly add following tags
> >  | Reported-by: kernel test robot <oliver.sang@intel.com>
> >  | Closes: https://lore.kernel.org/oe-lkp/202508261642.b15eefbb-lkp@intel.com
> >  
> >  
> >  [ 42.365933][ T393] WARNING: possible circular locking dependency detected
> >  [ 42.366428][ T393] 6.17.0-rc1-00035-g8bd938372706 #1 Tainted: G T
> >  [ 42.366985][ T393] ------------------------------------------------------
> >  [ 42.367490][ T393] rcu_torture_rea/393 is trying to acquire lock:
> >  [ 42.367952][ T393] ffffffffad41dc88 (rcu_tasks_trace_srcu_struct.srcu_wq.lock){....}-{2:2}, at: swake_up_one (kernel/sched/swait.c:52 (discriminator 1)) 
> >  [ 42.368775][ T393]
> >  [ 42.368775][ T393] but task is already holding lock:
> >  [ 42.369278][ T393] ffff88813d1ff2e8 (&p->pi_lock){-.-.}-{2:2}, at: rcutorture_one_extend (kernel/rcu/rcutorture.c:?) rcutorture 
> >  [ 42.370043][ T393]
> >  [ 42.370043][ T393] which lock already depends on the new lock.
> >  [ 42.370043][ T393]
> >  [ 42.370755][ T393]
> >  [ 42.370755][ T393] the existing dependency chain (in reverse order) is:
> >  [ 42.371388][ T393]
> >  [ 42.371388][ T393] -> #1 (&p->pi_lock){-.-.}-{2:2}:
> >  [ 42.371903][ T393] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:110 kernel/locking/spinlock.c:162) 
> >  [ 42.372309][ T393] try_to_wake_up (include/linux/spinlock.h:557 (discriminator 1) kernel/sched/core.c:4216 (discriminator 1)) 
> >  [ 42.372669][ T393] swake_up_locked (include/linux/list.h:111) 
> >  [ 42.373029][ T393] swake_up_one (kernel/sched/swait.c:54 (discriminator 1)) 
> >  [ 42.373380][ T393] tasks_tracing_torture_read_unlock (include/linux/srcu.h:408 (discriminator 1) include/linux/rcupdate_trace.h:81 (discriminator 1) kernel/rcu/rcutorture.c:1112 (discriminator 1)) rcutorture 
> >  [ 42.373952][ T393] rcutorture_one_extend (kernel/rcu/rcutorture.c:2141) rcutorture 
> >  [ 42.374452][ T393] rcu_torture_one_read_end (kernel/rcu/rcutorture.c:2357) rcutorture 
> >  [ 42.374976][ T393] rcu_torture_one_read (kernel/rcu/rcutorture.c:?) rcutorture 
> >  [ 42.375460][ T393] rcu_torture_reader (kernel/rcu/rcutorture.c:2443) rcutorture 
> >  [ 42.375920][ T393] kthread (kernel/kthread.c:465) 
> >  [ 42.376241][ T393] ret_from_fork (arch/x86/kernel/process.c:154) 
> >  [ 42.376603][ T393] ret_from_fork_asm (arch/x86/entry/entry_64.S:255) 
> >  [ 42.376973][ T393]
> >  [ 42.376973][ T393] -> #0 (rcu_tasks_trace_srcu_struct.srcu_wq.lock){....}-{2:2}:
> >  [ 42.377657][ T393] __lock_acquire (kernel/locking/lockdep.c:3166) 
> >  [ 42.378031][ T393] lock_acquire (kernel/locking/lockdep.c:5868) 
> >  [ 42.378378][ T393] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:110 kernel/locking/spinlock.c:162) 
> >  [ 42.378794][ T393] swake_up_one (kernel/sched/swait.c:52 (discriminator 1)) 
> >  [ 42.379152][ T393] tasks_tracing_torture_read_unlock (include/linux/srcu.h:408 (discriminator 1) include/linux/rcupdate_trace.h:81 (discriminator 1) kernel/rcu/rcutorture.c:1112 (discriminator 1)) rcutorture 
> >  [ 42.379714][ T393] rcutorture_one_extend (kernel/rcu/rcutorture.c:2141) rcutorture 
> >  [ 42.380217][ T393] rcu_torture_one_read_end (kernel/rcu/rcutorture.c:2357) rcutorture 
> >  [ 42.380731][ T393] rcu_torture_one_read (kernel/rcu/rcutorture.c:?) rcutorture 
> >  [ 42.381220][ T393] rcu_torture_reader (kernel/rcu/rcutorture.c:2443) rcutorture 
> >  [ 42.381714][ T393] kthread (kernel/kthread.c:465) 
> >  [ 42.382060][ T393] ret_from_fork (arch/x86/kernel/process.c:154) 
> >  [ 42.382420][ T393] ret_from_fork_asm (arch/x86/entry/entry_64.S:255) 
> >  [ 42.382796][ T393]
> >  [ 42.382796][ T393] other info that might help us debug this:
> >  [ 42.382796][ T393]
> >  [ 42.383515][ T393] Possible unsafe locking scenario:
> >  [ 42.383515][ T393]
> >  [ 42.384052][ T393] CPU0 CPU1
> >  [ 42.384428][ T393] ---- ----
> >  [ 42.384799][ T393] lock(&p->pi_lock);
> >  [ 42.385083][ T393] lock(rcu_tasks_trace_srcu_struct.srcu_wq.lock);
> >  [ 42.385707][ T393] lock(&p->pi_lock);
> >  [ 42.386180][ T393] lock(rcu_tasks_trace_srcu_struct.srcu_wq.lock);
> >  [ 42.386663][ T393]
> >  [ 42.386663][ T393] *** DEADLOCK ***
> >  [ 42.386663][ T393]
> >  [ 42.387236][ T393] 1 lock held by rcu_torture_rea/393:
> >  [ 42.387626][ T393] #0: ffff88813d1ff2e8 (&p->pi_lock){-.-.}-{2:2}, at: rcutorture_one_extend (kernel/rcu/rcutorture.c:?) rcutorture 
> >  [ 42.388419][ T393]
> >  [ 42.388419][ T393] stack backtrace:
> >  [ 42.388852][ T393] CPU: 0 UID: 0 PID: 393 Comm: rcu_torture_rea Tainted: G T 6.17.0-rc1-00035-g8bd938372706 #1 PREEMPT(full)
> >  [ 42.389758][ T393] Tainted: [T]=RANDSTRUCT
> >  [ 42.390057][ T393] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> >  [ 42.390786][ T393] Call Trace:
> >  [ 42.391020][ T393] <TASK>
> >  [ 42.391225][ T393] dump_stack_lvl (lib/dump_stack.c:123 (discriminator 2)) 
> >  [ 42.391544][ T393] print_circular_bug (kernel/locking/lockdep.c:2045) 
> >  [ 42.391898][ T393] check_noncircular (kernel/locking/lockdep.c:?) 
> >  [ 42.392242][ T393] __lock_acquire (kernel/locking/lockdep.c:3166) 
> >  [ 42.392594][ T393] ? __schedule (kernel/sched/sched.h:1531 (discriminator 1) kernel/sched/core.c:6969 (discriminator 1)) 
> >  [ 42.392930][ T393] ? lock_release (kernel/locking/lockdep.c:470 (discriminator 3)) 
> >  [ 42.393272][ T393] ? swake_up_one (kernel/sched/swait.c:52 (discriminator 1)) 
> >  [ 42.393610][ T393] lock_acquire (kernel/locking/lockdep.c:5868) 
> >  [ 42.393930][ T393] ? swake_up_one (kernel/sched/swait.c:52 (discriminator 1)) 
> >  [ 42.394264][ T393] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:110 kernel/locking/spinlock.c:162) 
> >  [ 42.394640][ T393] ? swake_up_one (kernel/sched/swait.c:52 (discriminator 1)) 
> >  [ 42.394969][ T393] swake_up_one (kernel/sched/swait.c:52 (discriminator 1)) 
> >  [ 42.395281][ T393] tasks_tracing_torture_read_unlock (include/linux/srcu.h:408 (discriminator 1) include/linux/rcupdate_trace.h:81 (discriminator 1) kernel/rcu/rcutorture.c:1112 (discriminator 1)) rcutorture 
> >  [ 42.395814][ T393] rcutorture_one_extend (kernel/rcu/rcutorture.c:2141) rcutorture 
> >  [ 42.396276][ T393] rcu_torture_one_read_end (kernel/rcu/rcutorture.c:2357) rcutorture 
> >  [ 42.396756][ T393] rcu_torture_one_read (kernel/rcu/rcutorture.c:?) rcutorture 
> >  [ 42.397219][ T393] ? __cfi_rcu_torture_reader (kernel/rcu/rcutorture.c:2426) rcutorture 
> >  [ 42.397690][ T393] rcu_torture_reader (kernel/rcu/rcutorture.c:2443) rcutorture 
> >  [ 42.398126][ T393] ? __cfi_rcu_torture_timer (kernel/rcu/rcutorture.c:2405) rcutorture 
> >  [ 42.398565][ T393] kthread (kernel/kthread.c:465) 
> >  [ 42.398857][ T393] ? __cfi_kthread (kernel/kthread.c:412) 
> >  [ 42.399169][ T393] ret_from_fork (arch/x86/kernel/process.c:154) 
> >  [ 42.399491][ T393] ? __cfi_kthread (kernel/kthread.c:412) 
> >  [ 42.399815][ T393] ret_from_fork_asm (arch/x86/entry/entry_64.S:255) 
> >  [ 42.400151][ T393] </TASK>
> >  
> >  
> >  The kernel config and materials to reproduce are available at:
> >  https://download.01.org/0day-ci/archive/20250826/202508261642.b15eefbb-lkp@intel.com
> >  
> >  
> >  
> >  -- 
> >  0-DAY CI Kernel Test Service
> >  https://github.com/intel/lkp-tests/wiki
> > 
> ------------------------------------------------------------------------
> 
> diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
> index 6e9fe2ce1075d5..db63378f062051 100644
> --- a/kernel/rcu/srcutiny.c
> +++ b/kernel/rcu/srcutiny.c
> @@ -106,7 +106,7 @@ void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
>  newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
>  WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval);
>  preempt_enable();
> - if (!newval && READ_ONCE(ssp->srcu_gp_waiting) && in_task())
> + if (!newval && READ_ONCE(ssp->srcu_gp_waiting) && in_task() && !irqs_disabled())


The following case may exist:


CPU0

task1:
__srcu_read_lock()
....


task2 preempts task1 and runs:

 srcu_drive_gp()
 ->swait_event_exclusive()


....
task1 continues running:
....
raw_spin_lock_irqsave()
__srcu_read_unlock()
->finds that all the previous conditions are met,
  but irqs_disabled() returns true,
  so swake_up_one() is not invoked.



task2 may then hang indefinitely.
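
To make the sequence above concrete, here is a minimal user-space sketch of it. It is not kernel code: struct model_srcu, model_read_lock(), model_read_unlock(), run_scenario() and the model_irqs_disabled flag are hypothetical stand-ins for the Tiny SRCU state and helpers, in_task() is assumed to be true, and a plain flag takes the place of the swait queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the Tiny SRCU state; not kernel code. */
struct model_srcu {
	int lock_nesting;     /* models ssp->srcu_lock_nesting[idx] */
	bool gp_waiting;      /* models ssp->srcu_gp_waiting */
	bool waiter_sleeping; /* true while the grace-period task sleeps */
};

/* Models the result of irqs_disabled() on the single CPU. */
static bool model_irqs_disabled;

static void model_read_lock(struct model_srcu *ssp)
{
	ssp->lock_nesting++;
}

/*
 * Models __srcu_read_unlock(). When check_irqs is true, the wakeup is
 * additionally gated on interrupts being enabled, as in the proposed diff.
 */
static void model_read_unlock(struct model_srcu *ssp, bool check_irqs)
{
	int newval;

	assert(ssp->lock_nesting > 0);
	newval = --ssp->lock_nesting;
	if (!newval && ssp->gp_waiting &&
	    !(check_irqs && model_irqs_disabled))
		ssp->waiter_sleeping = false; /* stands in for swake_up_one() */
}

/* Replays the task1/task2 interleaving; returns true if task2 stays asleep. */
static bool run_scenario(bool check_irqs)
{
	struct model_srcu ssp = { 0 };

	model_read_lock(&ssp);       /* task1: __srcu_read_lock() */
	ssp.gp_waiting = true;       /* task2: srcu_drive_gp() */
	ssp.waiter_sleeping = true;  /* task2: swait_event_exclusive() */
	model_irqs_disabled = true;  /* task1: raw_spin_lock_irqsave() */
	model_read_unlock(&ssp, check_irqs);
	model_irqs_disabled = false; /* task1: raw_spin_unlock_irqrestore() */
	return ssp.waiter_sleeping;
}
```

In this model, run_scenario(false) leaves the waiter woken, while run_scenario(true) leaves it asleep with no later event that would wake it, which is the hang described above. Whether the real code has some other path that eventually wakes task2 is not something this sketch can show.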

Thanks
Zqiang





>  swake_up_one(&ssp->srcu_wq);
>  }
>  EXPORT_SYMBOL_GPL(__srcu_read_unlock);
>

