From: Dave Jones <davej@codemonkey.org.uk>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>,
paulmck@linux.vnet.ibm.com,
Linux Kernel <linux-kernel@vger.kernel.org>,
Josh Triplett <josh@joshtriplett.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: 4.2-rc5 rcu stalls.
Date: Thu, 6 Aug 2015 00:15:08 -0400
Message-ID: <20150806041504.GA14220@codemonkey.org.uk>
In-Reply-To: <20150805123757.GA7051@lerouge>

On Wed, Aug 05, 2015 at 02:37:59PM +0200, Frederic Weisbecker wrote:
> On Tue, Aug 04, 2015 at 08:12:50PM -0400, Dave Jones wrote:
> > On Tue, Aug 04, 2015 at 12:54:35AM -0400, Sasha Levin wrote:
> > > On 08/03/2015 06:03 PM, Paul E. McKenney wrote:
> > > >> > Ugh, that doesn't revert cleanly. Got something handy ?
> > > > I do not, but perhaps either Sasha or Frederic do.
> > >
> > > I've attached a revert courtesy of Peter.
> >
> > Thanks. At first I thought this was doing the trick, but then I hit this again.
> >
> >
> > [23643.545873] INFO: rcu_preempt detected stalls on CPUs/tasks:
> > [23643.546031] Tasks blocked on level-0 rcu_node (CPUs 0-3): P31722
> > [23643.546173] (detected by 3, t=65002 jiffies, g=2256887, c=2256886, q=0)
> > [23643.546326] trinity-watchdo R running task 14336 31722 31721 0x00080000
> > [23643.546488] ffff8804fcfe7cc8 000000000000ded0 0000000000000002 ffff8804f58bb680
> > [23643.546661] ffff8800ce4951c0 ffff8804fcfe7cb8 ffff8804fcfe8000 ffff8804f6552608
> > [23643.546830] 0000000000000009 ffff8804fcfe7e88 0000000000000009 ffff8804fcfe7ce8
> > [23643.547001] Call Trace:
> > [23643.547058] [<ffffffff887fa2b2>] preempt_schedule_common+0x22/0x40
> > [23643.547201] [<ffffffff887fa2ef>] preempt_schedule+0x1f/0x30
> > [23643.547329] [<ffffffff88001058>] ___preempt_schedule+0x12/0x14
> > [23643.547465] [<ffffffff8808b76d>] ? do_send_sig_info+0x5d/0x80
> > [23643.547599] [<ffffffff887fff32>] ? _raw_spin_unlock_irqrestore+0x42/0x70
> > [23643.547753] [<ffffffff887fff50>] ? _raw_spin_unlock_irqrestore+0x60/0x70
> > [23643.547910] [<ffffffff8808b76d>] do_send_sig_info+0x5d/0x80
> > [23643.548039] [<ffffffff8808be62>] group_send_sig_info+0xb2/0x120
> > [23643.548175] [<ffffffff8808bdb5>] ? group_send_sig_info+0x5/0x120
> > [23643.548314] [<ffffffff880ea62f>] ? rcu_read_lock_held+0x4f/0x60
> > [23643.548451] [<ffffffff8808c05f>] kill_pid_info+0x7f/0x150
> > [23643.548576] [<ffffffff8808c000>] ? kill_pid_info+0x20/0x150
> > [23643.548705] [<ffffffff8808c244>] SYSC_kill+0xf4/0x2b0
> > [23643.548821] [<ffffffff8808c1ed>] ? SYSC_kill+0x9d/0x2b0
> > [23643.548942] [<ffffffff880d35cb>] ? trace_hardirqs_on_caller+0x14b/0x1e0
> > [23643.549097] [<ffffffff880d366d>] ? trace_hardirqs_on+0xd/0x10
> > [23643.549231] [<ffffffff88192f63>] ? context_tracking_user_exit+0x13/0x20
> > [23643.549387] [<ffffffff88012c47>] ? syscall_trace_enter_phase1+0xf7/0x150
> > [23643.549540] [<ffffffff88001017>] ? trace_hardirqs_on_thunk+0x17/0x19
> > [23643.549687] [<ffffffff8808e64e>] SyS_kill+0xe/0x10
> > [23643.549799] [<ffffffff88800997>] entry_SYSCALL_64_fastpath+0x12/0x6f
>
> If it still happens after Sasha's revert, which basically reverts all the offending
> patches related to preempt lately, then the reason might be elsewhere.
>
> How hard was it to reproduce? I see 23000 secs in your dmesg logs which is around 6 hours.

Interestingly, it happened again, but only after 7 hours.
I've yet to trigger it in a shorter timeframe. Frustrating.

[28190.798758] INFO: rcu_preempt detected stalls on CPUs/tasks:
[28190.798914] Tasks blocked on level-0 rcu_node (CPUs 0-3): P32189
[28190.799054] (detected by 1, t=65002 jiffies, g=2137396, c=2137395, q=0)
[28190.799203] trinity-c224 R running task 13856 32189 31964 0x00080000
[28190.799362] ffff8804f2323da8 ffffffffa67fa4d1 ffff8804fe170000 ffff8804b66db680
[28190.799531] ffff8804fe170000 ffff8804f2323d98 0000000000000000 ffff8804f2324000
[28190.799699] 0000000000000002 0000000000000000 0000000000000000 ffff8804f2323dc8
[28190.799866] Call Trace:
[28190.799921] [<ffffffffa67fa4d1>] ? preempt_schedule_irq+0x41/0xa0
[28190.800058] [<ffffffffa67fa4d7>] preempt_schedule_irq+0x47/0xa0
[28190.800191] [<ffffffffa6801529>] retint_kernel+0x1b/0x2d
[28190.800312] [<ffffffffa60d6319>] ? lock_acquire+0xd9/0x260
[28190.800438] [<ffffffffa609d295>] ? __task_pid_nr_ns+0x5/0x190
[28190.800568] [<ffffffffa680153b>] ? retint_kernel+0x2d/0x2d
[28190.800691] [<ffffffffa609d2d2>] __task_pid_nr_ns+0x42/0x190
[28190.800820] [<ffffffffa609d295>] ? __task_pid_nr_ns+0x5/0x190
[28190.800950] [<ffffffffa6091f0b>] sys_gettid+0x1b/0x20
[28190.801064] [<ffffffffa6800997>] entry_SYSCALL_64_fastpath+0x12/0x6f
[28190.801208] trinity-c224 R running task 13856 32189 31964 0x00080000
[28190.801365] ffff8804f2323da8 ffffffffa67fa4d1 ffff8804fe170000 ffff8804b66db680
[28190.801533] ffff8804fe170000 ffff8804f2323d98 0000000000000000 ffff8804f2324000
[28190.801702] 0000000000000002 0000000000000000 0000000000000000 ffff8804f2323dc8
[28190.801870] Call Trace:
[28190.801923] [<ffffffffa67fa4d1>] ? preempt_schedule_irq+0x41/0xa0
[28190.802060] [<ffffffffa67fa4d7>] preempt_schedule_irq+0x47/0xa0
[28190.802193] [<ffffffffa6801529>] retint_kernel+0x1b/0x2d
[28190.802313] [<ffffffffa60d6319>] ? lock_acquire+0xd9/0x260
[28190.802436] [<ffffffffa609d295>] ? __task_pid_nr_ns+0x5/0x190
[28190.802565] [<ffffffffa680153b>] ? retint_kernel+0x2d/0x2d
[28190.802688] [<ffffffffa609d2d2>] __task_pid_nr_ns+0x42/0x190
[28190.802815] [<ffffffffa609d295>] ? __task_pid_nr_ns+0x5/0x190
[28190.802945] [<ffffffffa6091f0b>] sys_gettid+0x1b/0x20
[28190.803058] [<ffffffffa6800997>] entry_SYSCALL_64_fastpath+0x12/0x6f
[29929.492752] INFO: rcu_preempt detected stalls on CPUs/tasks:
[29929.492906] Tasks blocked on level-0 rcu_node (CPUs 0-3): P289
[29929.493039] (detected by 0, t=65002 jiffies, g=2141006, c=2141005, q=0)
[29929.493188] systemd-journal R running task 12464 289 1 0x00080000
[29929.493347] ffff8804ff2bbae8 ffffffffa67fa4d1 ffff880501f81b40 ffff880503d43680
[29929.493515] ffff880501f81b40 ffff8804ff2bbad8 0000000000000000 ffff8804ff2bc000
[29929.493683] ffff8800d3e9f118 ffff8800d3e9eb40 0000000000000056 ffff8804ff2bbb08
[29929.493853] Call Trace:
[29929.493909] [<ffffffffa67fa4d1>] ? preempt_schedule_irq+0x41/0xa0
[29929.494046] [<ffffffffa67fa4d7>] preempt_schedule_irq+0x47/0xa0
[29929.494181] [<ffffffffa6801529>] retint_kernel+0x1b/0x2d
[29929.494304] [<ffffffffa67f9929>] ? __schedule+0x439/0xb20
[29929.494430] [<ffffffffa6001058>] ? ___preempt_schedule+0x12/0x14
[29929.494568] [<ffffffffa6001058>] ? ___preempt_schedule+0x12/0x14
[29929.494709] [<ffffffffa66b8b11>] ? sock_def_readable+0x161/0x190
[29929.501118] [<ffffffffa60ed468>] ? rcu_is_watching+0x38/0x60
[29929.507566] [<ffffffffa60ed481>] ? rcu_is_watching+0x51/0x60
[29929.513987] [<ffffffffa66b8b11>] sock_def_readable+0x161/0x190
[29929.520344] [<ffffffffa66b89b5>] ? sock_def_readable+0x5/0x190
[29929.526678] [<ffffffffa67ffe85>] ? _raw_spin_unlock+0x35/0x60
[29929.532988] [<ffffffffa67986c9>] unix_dgram_sendmsg+0x4f9/0x570
[29929.539184] [<ffffffffa66b509b>] ___sys_sendmsg+0x30b/0x320
[29929.545270] [<ffffffffa60cfe7e>] ? put_lock_stats.isra.29+0xe/0x30
[29929.551331] [<ffffffffa638a137>] ? debug_smp_processor_id+0x17/0x20
[29929.557285] [<ffffffffa60cfe7e>] ? put_lock_stats.isra.29+0xe/0x30
[29929.563204] [<ffffffffa60ad681>] ? get_parent_ip+0x11/0x50
[29929.569047] [<ffffffffa60ad813>] ? preempt_count_sub+0xa3/0xf0
[29929.574796] [<ffffffffa621a626>] ? __fget_light+0x66/0x90
[29929.580555] [<ffffffffa6192d53>] ? context_tracking_exit+0x43/0x240
[29929.586253] [<ffffffffa66b5712>] __sys_sendmsg+0x42/0x80
[29929.591843] [<ffffffffa66b5762>] SyS_sendmsg+0x12/0x20
[29929.597385] [<ffffffffa6800997>] entry_SYSCALL_64_fastpath+0x12/0x6f
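
For comparing time-to-reproduce across runs, the uptime at each stall can be
pulled out of dmesg and converted to hours. A minimal sketch, assuming the
stall banner reads exactly as in the reports above; the function name
stall_uptimes_hours and the regex are illustrative and not taken from any
kernel tool. Note that the 23643s report comes from a different boot than the
28190s and 29929s ones (the text addresses differ), so each value is uptime
within its own boot.

```python
import re

# Matches the stall banner as it appears in the reports in this thread.
# The pattern is an assumption based on these logs, not a stable kernel format.
STALL_RE = re.compile(r"\[\s*(\d+\.\d+)\] INFO: rcu_preempt detected stalls")


def stall_uptimes_hours(dmesg_text):
    """Return the uptime, in hours, of each stall banner found in dmesg_text."""
    return [float(m.group(1)) / 3600.0 for m in STALL_RE.finditer(dmesg_text)]


if __name__ == "__main__":
    # Banner lines copied from the three reports quoted in this thread.
    sample = (
        "[23643.545873] INFO: rcu_preempt detected stalls on CPUs/tasks:\n"
        "[28190.798758] INFO: rcu_preempt detected stalls on CPUs/tasks:\n"
        "[29929.492752] INFO: rcu_preempt detected stalls on CPUs/tasks:\n"
    )
    for hours in stall_uptimes_hours(sample):
        print(f"{hours:.2f} h")  # prints 6.57 h, 7.83 h, 8.31 h
```

Feeding it the output of `dmesg` (or a saved log) gives one value per stall
banner; the call traces that follow each banner are ignored.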