From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>,
Michael Wang <wangyun@linux.vnet.ibm.com>,
Dave Jones <davej@redhat.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: RCU idle CPU detection is broken in linux-next
Date: Tue, 25 Sep 2012 06:04:43 -0700
Message-ID: <20120925130443.GF2436@linux.vnet.ibm.com>
In-Reply-To: <20120925115917.GB2310@somewhere.redhat.com>

On Tue, Sep 25, 2012 at 01:59:26PM +0200, Frederic Weisbecker wrote:
> On Mon, Sep 24, 2012 at 09:04:20PM -0700, Paul E. McKenney wrote:
> > On Tue, Sep 25, 2012 at 01:41:18AM +0200, Frederic Weisbecker wrote:
> > > >
> > > > [ 168.703017] ------------[ cut here ]------------
> > > > [ 168.708117] WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common+0x4a/0x3a0()
> > > > [ 168.710034] Pid: 7871, comm: trinity-child65 Tainted: G W 3.6.0-rc6-next-20120924-sasha-00030-g71f256c #5
> > > > [ 168.710034] Call Trace:
> > > > [ 168.710034] <IRQ> [<ffffffff811c737a>] ? rcu_eqs_exit_common+0x4a/0x3a0
> > > > [ 168.710034] [<ffffffff811078b6>] warn_slowpath_common+0x86/0xb0
> > > > [ 168.710034] [<ffffffff811079a5>] warn_slowpath_null+0x15/0x20
> > > > [ 168.710034] [<ffffffff811c737a>] rcu_eqs_exit_common+0x4a/0x3a0
> > > > [ 168.710034] [<ffffffff811c79cc>] rcu_eqs_exit+0x9c/0xb0
> > > > [ 168.710034] [<ffffffff811c7a4c>] rcu_user_exit+0x6c/0xd0
> > > > [ 168.710034] [<ffffffff8106eb1f>] do_general_protection+0x1f/0x170
> > > > [ 168.710034] [<ffffffff83a0e624>] ? restore_args+0x30/0x30
> > > > [ 168.710034] [<ffffffff83a0e875>] general_protection+0x25/0x30
> > > > [ 168.710034] [<ffffffff810a3f06>] ? native_read_msr_safe+0x6/0x20
> > > > [ 168.710034] [<ffffffff81a0b34b>] __rdmsr_safe_on_cpu+0x2b/0x50
> > > > [ 168.710034] [<ffffffff819ec971>] ? list_del+0x11/0x40
> > > > [ 168.710034] [<ffffffff811886dc>] generic_smp_call_function_single_interrupt+0xec/0x120
> > > > [ 168.710034] [<ffffffff81151147>] ? account_system_vtime+0xd7/0x140
> > > > [ 168.710034] [<ffffffff81096f72>] smp_call_function_single_interrupt+0x22/0x40
> > > > [ 168.710034] [<ffffffff83a0fe2f>] call_function_single_interrupt+0x6f/0x80
> > > > [ 168.710034] <EOI> [<ffffffff83a0e5f4>] ? retint_restore_args+0x13/0x13
> > > > [ 168.710034] [<ffffffff811c7285>] ? rcu_user_enter+0x105/0x110
> > > > [ 168.710034] [<ffffffff8107e06d>] syscall_trace_leave+0xfd/0x150
> > > > [ 168.710034] [<ffffffff83a0f1ef>] int_check_syscall_exit_work+0x34/0x3d
> > > > [ 168.710034] ---[ end trace fd408dd21b70b87c ]---
> > > >
> > > > This is an exception inside an interrupt, and the interrupt
> > > > interrupted RCU user mode.
> > > > And we have that nesting:
> > > >
> > > > rcu_irq_enter(); <--- irq entry
> > > > rcu_user_exit(); <--- exception entry
> > > >
> > > > And rcu_eqs_exit() doesn't handle that very well...
> > >
> > > So either I should return immediately from rcu_user_exit() if
> > > we are in an interrupt, or we make rcu_user_exit() able to nest
> > > on rcu_irq_enter() :)
> >
> > Both are eminently doable, with varying degrees of hackery.
> >
> > What makes the most sense from an adaptive-idle viewpoint?
>
> Given that we have:
>
> rcu_irq_enter()
> rcu_user_exit()
> rcu_user_enter()
> rcu_irq_exit()

Indeed, the code to deal with irq misnestings won't like that at all.
And we are in the kernel between rcu_user_exit() and rcu_user_enter()
(right?), so we could in fact see irq misnestings.

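
To make the failure concrete, here is a toy userspace model of the
sequence quoted above. It is only a sketch: it assumes an even/odd
counter convention (even means extended quiescent state, odd means RCU
is watching), all toy_* names are made up for illustration, and the
real bookkeeping in kernel/rcutree.c is more involved.

```c
/* Toy userspace model, NOT kernel code.  Assumption: an even counter
 * means "in an extended quiescent state" (idle or user mode) and an
 * odd counter means "RCU is watching this CPU". */
struct toy_rcu {
	unsigned long dynticks;  /* even: quiescent, odd: watching */
	int warnings;            /* stands in for WARN_ON_ONCE() hits */
};

/* Leaving a quiescent state must leave the counter odd. */
static void toy_eqs_exit(struct toy_rcu *r)
{
	r->dynticks++;
	if (!(r->dynticks & 1))
		r->warnings++;
}

/* Entering a quiescent state must leave the counter even. */
static void toy_eqs_enter(struct toy_rcu *r)
{
	r->dynticks++;
	if (r->dynticks & 1)
		r->warnings++;
}

/* Hypothetical wrappers named after the calls in the trace.  None of
 * them knows about the others, which is the point of the exercise. */
static void toy_irq_enter(struct toy_rcu *r)  { toy_eqs_exit(r); }
static void toy_irq_exit(struct toy_rcu *r)   { toy_eqs_enter(r); }
static void toy_user_exit(struct toy_rcu *r)  { toy_eqs_exit(r); }
static void toy_user_enter(struct toy_rcu *r) { toy_eqs_enter(r); }

/* Replay the reported sequence; returns the number of warnings. */
static int toy_replay_misnesting(void)
{
	struct toy_rcu r = { 0, 0 };  /* start in user mode (even) */

	toy_irq_enter(&r);   /* irq interrupts user mode: 0 -> 1, fine */
	toy_user_exit(&r);   /* exception inside the irq: 1 -> 2, warns */
	toy_user_enter(&r);  /* 2 -> 3, warns again in this model */
	toy_irq_exit(&r);    /* 3 -> 4, back to quiescent */
	return r.warnings;
}
```

In this model the second exit from the quiescent state is what trips
the check, which matches the shape of the splat: rcu_user_exit()
called when the interrupt entry has already done the exit.
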
> And we already have rcu_user_exit_after_irq(), so it starts to get confusing
> if we allow that nesting. Although if we find a solution that, in the end,
> merges rcu_user_exit() with rcu_user_exit_after_irq(), and likewise for the enter version,
> that would probably be a good thing, provided it involves neither more
> complicated rdtp->dynticks_nesting tricks nor more overhead.
>
> Otherwise we could avoid calling rcu_user_* when we are in an irq. Once we have
> the user_hooks layer, we can perhaps manage that from there. For
> now maybe we can return early when in_interrupt() is true in the rcu user APIs.

This last sounds best.

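
A rough sketch of what the early return might look like, again as a
toy userspace model rather than a patch. The guard_* names and the
irq_depth field are hypothetical stand-ins (irq_depth plays the role
of what in_interrupt() would report), and the even/odd counter
convention is an assumption of the model.

```c
/* Toy userspace model, NOT kernel code.  All guard_* names are
 * hypothetical; irq_depth stands in for what in_interrupt() reports. */
struct guard_cpu {
	int irq_depth;           /* > 0 while handling an interrupt */
	int irq_left_eqs;        /* outermost irq pulled us out of EQS */
	unsigned long dynticks;  /* even: quiescent, odd: RCU watching */
	int warnings;            /* stand-in for WARN_ON_ONCE() hits */
};

static void guard_eqs_exit(struct guard_cpu *c)
{
	c->dynticks++;
	if (!(c->dynticks & 1))  /* must be odd after leaving EQS */
		c->warnings++;
}

static void guard_eqs_enter(struct guard_cpu *c)
{
	c->dynticks++;
	if (c->dynticks & 1)     /* must be even after entering EQS */
		c->warnings++;
}

static void guard_irq_enter(struct guard_cpu *c)
{
	/* Only the outermost irq, arriving in EQS, leaves the EQS. */
	if (c->irq_depth++ == 0 && !(c->dynticks & 1)) {
		c->irq_left_eqs = 1;
		guard_eqs_exit(c);
	}
}

static void guard_irq_exit(struct guard_cpu *c)
{
	if (--c->irq_depth == 0 && c->irq_left_eqs) {
		c->irq_left_eqs = 0;
		guard_eqs_enter(c);
	}
}

static void guard_user_exit(struct guard_cpu *c)
{
	if (c->irq_depth > 0)    /* the proposed in_interrupt() bail-out */
		return;
	guard_eqs_exit(c);
}

static void guard_user_enter(struct guard_cpu *c)
{
	if (c->irq_depth > 0)    /* same bail-out on the way back */
		return;
	guard_eqs_enter(c);
}

/* Replay the reported sequence, then an ordinary kernel entry/exit. */
static int guard_replay(void)
{
	struct guard_cpu c = { 0, 0, 0, 0 };  /* start in user mode */

	guard_irq_enter(&c);   /* 0 -> 1 */
	guard_user_exit(&c);   /* skipped: we are in an interrupt */
	guard_user_enter(&c);  /* skipped */
	guard_irq_exit(&c);    /* 1 -> 2 */
	guard_user_exit(&c);   /* ordinary kernel entry: 2 -> 3 */
	guard_user_enter(&c);  /* back to user mode: 3 -> 4 */
	return c.warnings;
}
```

With the bail-out in place the model replays the reported sequence
without tripping either check, though it says nothing about
misnested irqs.
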
My main concern is irq misnesting. We might need to do something ugly
like record the interrupt nesting level at rcu_user_exit() and restore
it at rcu_user_enter(). Sigh!!!

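
Something like the following toy model is what I have in mind for the
ugly variant. It is a sketch only: the sav_* names and both fields are
invented for illustration, and a single snapshot slot would not be
enough if the exit/enter pairs themselves could nest.

```c
/* Toy userspace model, NOT kernel code.  All sav_* names are
 * hypothetical.  One snapshot slot only: nested user exit/enter
 * pairs would need a stack instead. */
struct sav_cpu {
	int irq_nesting;        /* current interrupt nesting level */
	int saved_irq_nesting;  /* snapshot taken at user exit */
};

static void sav_irq_enter(struct sav_cpu *c) { c->irq_nesting++; }
static void sav_irq_exit(struct sav_cpu *c)  { c->irq_nesting--; }

/* Record the nesting level on the way into the kernel... */
static void sav_user_exit(struct sav_cpu *c)
{
	c->saved_irq_nesting = c->irq_nesting;
}

/* ...and put it back on the way out, discarding any misnesting
 * that accumulated in between. */
static void sav_user_enter(struct sav_cpu *c)
{
	c->irq_nesting = c->saved_irq_nesting;
}

/* Replay a misnested sequence; returns the final nesting level. */
static int sav_replay(void)
{
	struct sav_cpu c = { 0, 0 };

	sav_irq_enter(&c);   /* irq from user mode: level 1 */
	sav_user_exit(&c);   /* exception inside the irq: snapshot 1 */
	sav_irq_enter(&c);   /* misnested entry, no matching exit: 2 */
	sav_user_enter(&c);  /* restore the snapshot: back to 1 */
	sav_irq_exit(&c);    /* outer irq exit: 0 */
	return c.irq_nesting;
}
```

Without the restore, the unmatched entry would leave the level at 1
after the outer interrupt returns.
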
> Let's first ensure I diagnosed it well and we don't have other problems detected
> by Sasha. I'm cooking a testing patch.

Excellent point!

							Thanx, Paul