From: Dave Jones <davej@codemonkey.org.uk>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>,
paulmck@linux.vnet.ibm.com,
Linux Kernel <linux-kernel@vger.kernel.org>,
Josh Triplett <josh@joshtriplett.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: 4.2-rc5 rcu stalls.
Date: Thu, 6 Aug 2015 00:15:08 -0400 [thread overview]
Message-ID: <20150806041504.GA14220@codemonkey.org.uk> (raw)
In-Reply-To: <20150805123757.GA7051@lerouge>
On Wed, Aug 05, 2015 at 02:37:59PM +0200, Frederic Weisbecker wrote:
> On Tue, Aug 04, 2015 at 08:12:50PM -0400, Dave Jones wrote:
> > On Tue, Aug 04, 2015 at 12:54:35AM -0400, Sasha Levin wrote:
> > > On 08/03/2015 06:03 PM, Paul E. McKenney wrote:
> > > >> > Ugh, that doesn't revert cleanly. Got something handy ?
> > > > I do not, but perhaps either Sasha or Frederic do.
> > >
> > > I've attached a revert courtesy of Peter.
> >
> > Thanks. At first I thought this was doing the trick, but then I hit this again.
> >
> >
> > [23643.545873] INFO: rcu_preempt detected stalls on CPUs/tasks:
> > [23643.546031] Tasks blocked on level-0 rcu_node (CPUs 0-3): P31722
> > [23643.546173] (detected by 3, t=65002 jiffies, g=2256887, c=2256886, q=0)
> > [23643.546326] trinity-watchdo R running task 14336 31722 31721 0x00080000
> > [23643.546488] ffff8804fcfe7cc8 000000000000ded0 0000000000000002 ffff8804f58bb680
> > [23643.546661] ffff8800ce4951c0 ffff8804fcfe7cb8 ffff8804fcfe8000 ffff8804f6552608
> > [23643.546830] 0000000000000009 ffff8804fcfe7e88 0000000000000009 ffff8804fcfe7ce8
> > [23643.547001] Call Trace:
> > [23643.547058] [<ffffffff887fa2b2>] preempt_schedule_common+0x22/0x40
> > [23643.547201] [<ffffffff887fa2ef>] preempt_schedule+0x1f/0x30
> > [23643.547329] [<ffffffff88001058>] ___preempt_schedule+0x12/0x14
> > [23643.547465] [<ffffffff8808b76d>] ? do_send_sig_info+0x5d/0x80
> > [23643.547599] [<ffffffff887fff32>] ? _raw_spin_unlock_irqrestore+0x42/0x70
> > [23643.547753] [<ffffffff887fff50>] ? _raw_spin_unlock_irqrestore+0x60/0x70
> > [23643.547910] [<ffffffff8808b76d>] do_send_sig_info+0x5d/0x80
> > [23643.548039] [<ffffffff8808be62>] group_send_sig_info+0xb2/0x120
> > [23643.548175] [<ffffffff8808bdb5>] ? group_send_sig_info+0x5/0x120
> > [23643.548314] [<ffffffff880ea62f>] ? rcu_read_lock_held+0x4f/0x60
> > [23643.548451] [<ffffffff8808c05f>] kill_pid_info+0x7f/0x150
> > [23643.548576] [<ffffffff8808c000>] ? kill_pid_info+0x20/0x150
> > [23643.548705] [<ffffffff8808c244>] SYSC_kill+0xf4/0x2b0
> > [23643.548821] [<ffffffff8808c1ed>] ? SYSC_kill+0x9d/0x2b0
> > [23643.548942] [<ffffffff880d35cb>] ? trace_hardirqs_on_caller+0x14b/0x1e0
> > [23643.549097] [<ffffffff880d366d>] ? trace_hardirqs_on+0xd/0x10
> > [23643.549231] [<ffffffff88192f63>] ? context_tracking_user_exit+0x13/0x20
> > [23643.549387] [<ffffffff88012c47>] ? syscall_trace_enter_phase1+0xf7/0x150
> > [23643.549540] [<ffffffff88001017>] ? trace_hardirqs_on_thunk+0x17/0x19
> > [23643.549687] [<ffffffff8808e64e>] SyS_kill+0xe/0x10
> > [23643.549799] [<ffffffff88800997>] entry_SYSCALL_64_fastpath+0x12/0x6f
>
> If it still happens after Sasha's revert, which basically revert all the offending
> patches related to preempt lately, then the reason might be elsewhere.
>
> How hard was it to reproduce? I see 23000 secs in your dmesg logs which is around 6 hours.
Interestingly, it happened again, but only after 7 hours.
I've yet to trigger it in a shorter timeframe. Frustrating.
[28190.798758] INFO: rcu_preempt detected stalls on CPUs/tasks:
[28190.798914] Tasks blocked on level-0 rcu_node (CPUs 0-3): P32189
[28190.799054] (detected by 1, t=65002 jiffies, g=2137396, c=2137395, q=0)
[28190.799203] trinity-c224 R running task 13856 32189 31964 0x00080000
[28190.799362] ffff8804f2323da8 ffffffffa67fa4d1 ffff8804fe170000 ffff8804b66db680
[28190.799531] ffff8804fe170000 ffff8804f2323d98 0000000000000000 ffff8804f2324000
[28190.799699] 0000000000000002 0000000000000000 0000000000000000 ffff8804f2323dc8
[28190.799866] Call Trace:
[28190.799921] [<ffffffffa67fa4d1>] ? preempt_schedule_irq+0x41/0xa0
[28190.800058] [<ffffffffa67fa4d7>] preempt_schedule_irq+0x47/0xa0
[28190.800191] [<ffffffffa6801529>] retint_kernel+0x1b/0x2d
[28190.800312] [<ffffffffa60d6319>] ? lock_acquire+0xd9/0x260
[28190.800438] [<ffffffffa609d295>] ? __task_pid_nr_ns+0x5/0x190
[28190.800568] [<ffffffffa680153b>] ? retint_kernel+0x2d/0x2d
[28190.800691] [<ffffffffa609d2d2>] __task_pid_nr_ns+0x42/0x190
[28190.800820] [<ffffffffa609d295>] ? __task_pid_nr_ns+0x5/0x190
[28190.800950] [<ffffffffa6091f0b>] sys_gettid+0x1b/0x20
[28190.801064] [<ffffffffa6800997>] entry_SYSCALL_64_fastpath+0x12/0x6f
[28190.801208] trinity-c224 R running task 13856 32189 31964 0x00080000
[28190.801365] ffff8804f2323da8 ffffffffa67fa4d1 ffff8804fe170000 ffff8804b66db680
[28190.801533] ffff8804fe170000 ffff8804f2323d98 0000000000000000 ffff8804f2324000
[28190.801702] 0000000000000002 0000000000000000 0000000000000000 ffff8804f2323dc8
[28190.801870] Call Trace:
[28190.801923] [<ffffffffa67fa4d1>] ? preempt_schedule_irq+0x41/0xa0
[28190.802060] [<ffffffffa67fa4d7>] preempt_schedule_irq+0x47/0xa0
[28190.802193] [<ffffffffa6801529>] retint_kernel+0x1b/0x2d
[28190.802313] [<ffffffffa60d6319>] ? lock_acquire+0xd9/0x260
[28190.802436] [<ffffffffa609d295>] ? __task_pid_nr_ns+0x5/0x190
[28190.802565] [<ffffffffa680153b>] ? retint_kernel+0x2d/0x2d
[28190.802688] [<ffffffffa609d2d2>] __task_pid_nr_ns+0x42/0x190
[28190.802815] [<ffffffffa609d295>] ? __task_pid_nr_ns+0x5/0x190
[28190.802945] [<ffffffffa6091f0b>] sys_gettid+0x1b/0x20
[28190.803058] [<ffffffffa6800997>] entry_SYSCALL_64_fastpath+0x12/0x6f
[29929.492752] INFO: rcu_preempt detected stalls on CPUs/tasks:
[29929.492906] Tasks blocked on level-0 rcu_node (CPUs 0-3): P289
[29929.493039] (detected by 0, t=65002 jiffies, g=2141006, c=2141005, q=0)
[29929.493188] systemd-journal R running task 12464 289 1 0x00080000
[29929.493347] ffff8804ff2bbae8 ffffffffa67fa4d1 ffff880501f81b40 ffff880503d43680
[29929.493515] ffff880501f81b40 ffff8804ff2bbad8 0000000000000000 ffff8804ff2bc000
[29929.493683] ffff8800d3e9f118 ffff8800d3e9eb40 0000000000000056 ffff8804ff2bbb08
[29929.493853] Call Trace:
[29929.493909] [<ffffffffa67fa4d1>] ? preempt_schedule_irq+0x41/0xa0
[29929.494046] [<ffffffffa67fa4d7>] preempt_schedule_irq+0x47/0xa0
[29929.494181] [<ffffffffa6801529>] retint_kernel+0x1b/0x2d
[29929.494304] [<ffffffffa67f9929>] ? __schedule+0x439/0xb20
[29929.494430] [<ffffffffa6001058>] ? ___preempt_schedule+0x12/0x14
[29929.494568] [<ffffffffa6001058>] ? ___preempt_schedule+0x12/0x14
[29929.494709] [<ffffffffa66b8b11>] ? sock_def_readable+0x161/0x190
[29929.501118] [<ffffffffa60ed468>] ? rcu_is_watching+0x38/0x60
[29929.507566] [<ffffffffa60ed481>] ? rcu_is_watching+0x51/0x60
[29929.513987] [<ffffffffa66b8b11>] sock_def_readable+0x161/0x190
[29929.520344] [<ffffffffa66b89b5>] ? sock_def_readable+0x5/0x190
[29929.526678] [<ffffffffa67ffe85>] ? _raw_spin_unlock+0x35/0x60
[29929.532988] [<ffffffffa67986c9>] unix_dgram_sendmsg+0x4f9/0x570
[29929.539184] [<ffffffffa66b509b>] ___sys_sendmsg+0x30b/0x320
[29929.545270] [<ffffffffa60cfe7e>] ? put_lock_stats.isra.29+0xe/0x30
[29929.551331] [<ffffffffa638a137>] ? debug_smp_processor_id+0x17/0x20
[29929.557285] [<ffffffffa60cfe7e>] ? put_lock_stats.isra.29+0xe/0x30
[29929.563204] [<ffffffffa60ad681>] ? get_parent_ip+0x11/0x50
[29929.569047] [<ffffffffa60ad813>] ? preempt_count_sub+0xa3/0xf0
[29929.574796] [<ffffffffa621a626>] ? __fget_light+0x66/0x90
[29929.580555] [<ffffffffa6192d53>] ? context_tracking_exit+0x43/0x240
[29929.586253] [<ffffffffa66b5712>] __sys_sendmsg+0x42/0x80
[29929.591843] [<ffffffffa66b5762>] SyS_sendmsg+0x12/0x20
[29929.597385] [<ffffffffa6800997>] entry_SYSCALL_64_fastpath+0x12/0x6f
prev parent reply other threads:[~2015-08-06 4:15 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-03 21:08 4.2-rc5 rcu stalls Dave Jones
2015-08-03 21:37 ` Paul E. McKenney
2015-08-03 21:55 ` Dave Jones
2015-08-03 22:03 ` Paul E. McKenney
2015-08-04 4:54 ` Sasha Levin
2015-08-05 0:12 ` Dave Jones
2015-08-05 12:37 ` Frederic Weisbecker
2015-08-05 13:18 ` Dave Jones
2015-08-05 14:38 ` Frederic Weisbecker
2015-08-05 14:46 ` Dave Jones
2015-08-06 4:15 ` Dave Jones [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150806041504.GA14220@codemonkey.org.uk \
--to=davej@codemonkey.org.uk \
--cc=fweisbec@gmail.com \
--cc=josh@joshtriplett.org \
--cc=linux-kernel@vger.kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=sasha.levin@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.