All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Jones <davej@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: paulmck@linux.vnet.ibm.com,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	fweisbec@gmail.com, Jiri Olsa <jolsa@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: rcu_read_lock() used illegally while idle!
Date: Tue, 28 May 2013 16:13:29 -0400	[thread overview]
Message-ID: <20130528201329.GA24342@redhat.com> (raw)
In-Reply-To: <1369506004.15552.5.camel@gandalf.local.home>

On Sat, May 25, 2013 at 02:20:04PM -0400, Steven Rostedt wrote:
 > On Sat, 2013-05-25 at 09:59 -0400, Dave Jones wrote:
 > > On Fri, May 24, 2013 at 03:23:40PM -0400, Steven Rostedt wrote:
 > >  > On Fri, 2013-05-24 at 10:23 -0400, Dave Jones wrote:
 > >  > 
 > >  > > .config: http://paste.fedoraproject.org/14281/94052971/raw/
 > >  > > 
 > >  > > trace shows the problem process was 'cc1', so I was likely building a kernel
 > >  > > at the time.  There was also a trinity run going on in the background.
 > >  > > 
 > >  > > cmdline: nothing special..
 > >  > > 
 > >  > > BOOT_IMAGE=/vmlinuz-3.10.0-rc2+ root=UUID=bee21cd9-1852-4d1d-9e9d-2e44332b8df1 ro rd.md=0 rd.lvm=0 rd.dm=0 vconsole.keymap=us rd.luks=0 vconsole.font=latarcyrheb-sun16 consoleblank=0 console=ttyUSB0,115200 console=tty0 pause_on_oops=30 audit=0
 > >  > > 
 > >  > 
 > >  > OK, I'm able to reproduce this. Looks like the same issue happens from
 > >  > other ways from userspace into the kernel (the next way was tracing
 > >  > system calls).
 > >  > 
 > >  > Forget the other patch. Here's a new patch that is more specific to
 > >  > tracing and context tracking.
 > >  > 
 > >  > Can you try this one out, please.
 > > 
 > > Took a lot longer to hit this..
 > 
 > This is a same but different bug ;-)
 > 
 > Looks like we fixed all the function tracing infrastructure problems,
 > but this is a function tracer user problem. Namely perf.

another variant of the same ? or different different ?

[12572.705832] ======================================================
[12572.750317] [ INFO: possible circular locking dependency detected ]
[12572.796978] 3.10.0-rc3+ #39 Not tainted
[12572.833381] -------------------------------------------------------
[12572.862233] trinity-child17/31341 is trying to acquire lock:
[12572.870390]  (rcu_node_0){..-.-.}, at: [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0
[12572.878859] 
but task is already holding lock:
[12572.894894]  (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0
[12572.903381] 
which lock already depends on the new lock.

[12572.927541] 
the existing dependency chain (in reverse order) is:
[12572.943736] 
-> #4 (&ctx->lock){-.-...}:
[12572.960032]        [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
[12572.968337]        [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80
[12572.976633]        [<ffffffff8113c987>] __perf_event_task_sched_out+0x2e7/0x5e0
[12572.984969]        [<ffffffff81088953>] perf_event_task_sched_out+0x93/0xa0
[12572.993326]        [<ffffffff816ea0bf>] __schedule+0x2cf/0x9c0
[12573.001652]        [<ffffffff816eacfe>] schedule_user+0x2e/0x70
[12573.009998]        [<ffffffff816ecd64>] retint_careful+0x12/0x2e
[12573.018321] 
-> #3 (&rq->lock){-.-.-.}:
[12573.034628]        [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
[12573.042930]        [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80
[12573.051248]        [<ffffffff8108e6a7>] wake_up_new_task+0xb7/0x260
[12573.059579]        [<ffffffff810492f5>] do_fork+0x105/0x470
[12573.067880]        [<ffffffff81049686>] kernel_thread+0x26/0x30
[12573.076202]        [<ffffffff816cee63>] rest_init+0x23/0x140
[12573.084508]        [<ffffffff81ed8e1f>] start_kernel+0x3f1/0x3fe
[12573.092852]        [<ffffffff81ed856f>] x86_64_start_reservations+0x2a/0x2c
[12573.101233]        [<ffffffff81ed863d>] x86_64_start_kernel+0xcc/0xcf
[12573.109528] 
-> #2 (&p->pi_lock){-.-.-.}:
[12573.125675]        [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
[12573.133829]        [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90
[12573.141964]        [<ffffffff8108e881>] try_to_wake_up+0x31/0x320
[12573.150065]        [<ffffffff8108ebe2>] default_wake_function+0x12/0x20
[12573.158151]        [<ffffffff8107bbf8>] autoremove_wake_function+0x18/0x40
[12573.166195]        [<ffffffff81085398>] __wake_up_common+0x58/0x90
[12573.174215]        [<ffffffff81086909>] __wake_up+0x39/0x50
[12573.182146]        [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50
[12573.190119]        [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0
[12573.198023]        [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930
[12573.205860]        [<ffffffff8107a91d>] kthread+0xed/0x100
[12573.213656]        [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0
[12573.221379] 
-> #1 (&rsp->gp_wq){..-.-.}:
[12573.236329]        [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
[12573.243783]        [<ffffffff816ebe9b>] _raw_spin_lock_irqsave+0x4b/0x90
[12573.251178]        [<ffffffff810868f3>] __wake_up+0x23/0x50
[12573.258505]        [<ffffffff810fc3da>] rcu_start_gp_advanced.isra.11+0x4a/0x50
[12573.265891]        [<ffffffff810fdb09>] rcu_start_future_gp+0x1c9/0x1f0
[12573.273248]        [<ffffffff810fe2c4>] rcu_nocb_kthread+0x114/0x930
[12573.280564]        [<ffffffff8107a91d>] kthread+0xed/0x100
[12573.287807]        [<ffffffff816f4b1c>] ret_from_fork+0x7c/0xb0
[12573.295067] 
-> #0 (rcu_node_0){..-.-.}:
[12573.309293]        [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0
[12573.316568]        [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
[12573.323825]        [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80
[12573.331081]        [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0
[12573.338377]        [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0
[12573.345648]        [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0
[12573.352942]        [<ffffffff8113938e>] find_get_context+0x4e/0x1f0
[12573.360211]        [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0
[12573.367514]        [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10
[12573.374816]        [<ffffffff816f4dd4>] tracesys+0xdd/0xe2
[12573.382068] 
other info that might help us debug this:

[12573.403229] Chain exists of:
  rcu_node_0 --> &rq->lock --> &ctx->lock

[12573.424471]  Possible unsafe locking scenario:

[12573.438499]        CPU0                    CPU1
[12573.445599]        ----                    ----
[12573.452691]   lock(&ctx->lock);
[12573.459799]                                lock(&rq->lock);
[12573.467010]                                lock(&ctx->lock);
[12573.474192]   lock(rcu_node_0);
[12573.481262] 
 *** DEADLOCK ***

[12573.501931] 1 lock held by trinity-child17/31341:
[12573.508990]  #0:  (&ctx->lock){-.-...}, at: [<ffffffff811390ed>] perf_lock_task_context+0x7d/0x2d0
[12573.516475] 
stack backtrace:
[12573.530395] CPU: 1 PID: 31341 Comm: trinity-child17 Not tainted 3.10.0-rc3+ #39
[12573.545357]  ffffffff825b4f90 ffff880219f1dbc0 ffffffff816e375b ffff880219f1dc00
[12573.552868]  ffffffff816dfa5d ffff880219f1dc50 ffff88023ce4d1f8 ffff88023ce4ca40
[12573.560353]  0000000000000001 0000000000000001 ffff88023ce4d1f8 ffff880219f1dcc0
[12573.567856] Call Trace:
[12573.575011]  [<ffffffff816e375b>] dump_stack+0x19/0x1b
[12573.582284]  [<ffffffff816dfa5d>] print_circular_bug+0x200/0x20f
[12573.589637]  [<ffffffff810b8d36>] __lock_acquire+0x1786/0x1af0
[12573.596982]  [<ffffffff810918f5>] ? sched_clock_cpu+0xb5/0x100
[12573.604344]  [<ffffffff810b9851>] lock_acquire+0x91/0x1f0
[12573.611652]  [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0
[12573.619030]  [<ffffffff816ebc90>] _raw_spin_lock+0x40/0x80
[12573.626331]  [<ffffffff811054ff>] ? rcu_read_unlock_special+0x9f/0x4c0
[12573.633671]  [<ffffffff811054ff>] rcu_read_unlock_special+0x9f/0x4c0
[12573.640992]  [<ffffffff811390ed>] ? perf_lock_task_context+0x7d/0x2d0
[12573.648330]  [<ffffffff810b429e>] ? put_lock_stats.isra.29+0xe/0x40
[12573.655662]  [<ffffffff813095a0>] ? delay_tsc+0x90/0xe0
[12573.662964]  [<ffffffff810760a6>] __rcu_read_unlock+0x96/0xa0
[12573.670276]  [<ffffffff811391b3>] perf_lock_task_context+0x143/0x2d0
[12573.677622]  [<ffffffff81139070>] ? __perf_event_enable+0x370/0x370
[12573.684981]  [<ffffffff8113938e>] find_get_context+0x4e/0x1f0
[12573.692358]  [<ffffffff811403f4>] SYSC_perf_event_open+0x514/0xbd0
[12573.699753]  [<ffffffff8108cd9d>] ? get_parent_ip+0xd/0x50
[12573.707135]  [<ffffffff810b71fd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[12573.714599]  [<ffffffff81140e49>] SyS_perf_event_open+0x9/0x10
[12573.721996]  [<ffffffff816f4dd4>] tracesys+0xdd/0xe2


  parent reply	other threads:[~2013-05-28 20:14 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-21 17:31 rcu_read_lock() used illegally while idle! Dave Jones
2013-05-21 23:58 ` WARNING: at kernel/rcutree.c:388 rcu_eqs_enter Dave Jones
2013-05-22  9:51   ` Paul E. McKenney
2013-05-23 15:51     ` Dave Jones
2013-05-26 18:17       ` Paul E. McKenney
2013-05-22  9:36 ` rcu_read_lock() used illegally while idle! Paul E. McKenney
2013-05-22 12:49   ` Steven Rostedt
2013-05-23 16:40     ` Dave Jones
2013-05-24 13:46       ` Steven Rostedt
2013-05-24 14:23         ` Dave Jones
2013-05-24 19:23           ` Steven Rostedt
2013-05-24 22:41             ` Dave Jones
2013-05-25 13:59             ` Dave Jones
2013-05-25 18:20               ` Steven Rostedt
2013-05-26 18:04                 ` Jiri Olsa
2013-05-28 20:13                 ` Dave Jones [this message]
2013-05-28 20:27                   ` Steven Rostedt
2013-05-28 21:32                   ` [BUG] with rcu nocb, don't call wake up holding rnp->lock (was: rcu_read_lock() used illegally while idle!) Steven Rostedt
2013-05-29  3:29                     ` Paul E. McKenney
2013-06-19 18:39             ` [tip:sched/core] tracing/context-tracking: Add preempt_schedule_context() for tracing tip-bot for Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130528201329.GA24342@redhat.com \
    --to=davej@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.