From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Knut Petersen" <Knut_Petersen@t-online.de>,
"Ingo Molnar" <mingo@kernel.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Greg KH" <greg@kroah.com>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [BUG 3.12.rc4] Oops: unable to handle kernel paging request during shutdown
Date: Mon, 14 Oct 2013 14:28:30 -0700
Message-ID: <20131014212830.GD5790@linux.vnet.ibm.com>
In-Reply-To: <CA+55aFwN++wO=rTFaH7m6YYQX+Fv3qDt6Hxs7UnPUJpFsrwSkA@mail.gmail.com>
On Mon, Oct 14, 2013 at 10:53:03AM -0700, Linus Torvalds wrote:
> Hmm. No obvious ideas come to mind, but I'm adding more people to the cc.
>
> Clearly the wait_event_interruptible_timeout() in the RCU grace-period
> thread causes this, but I'm not seeing why shutdown would trigger it.
>
> The code disassembles to
>
> 0: 85 db test %ebx,%ebx
> 2: 79 0c jns 0x10
> 4: 81 e6 ff 00 00 00 and $0xff,%esi
> a: 8d 44 f0 30 lea 0x30(%eax,%esi,8),%eax
> e: eb 0a jmp 0x1a
> 10: c1 e9 1a shr $0x1a,%ecx
> 13: 8d 84 c8 30 0e 00 00 lea 0xe30(%eax,%ecx,8),%eax
> 1a: 8b 48 04 mov 0x4(%eax),%ecx
> 1d: 89 50 04 mov %edx,0x4(%eax)
> 20: 89 02 mov %eax,(%edx)
> 22: 89 4a 04 mov %ecx,0x4(%edx)
> 25:* 89 11 mov %edx,(%ecx) <-- trapping instruction
> 27: 5b pop %ebx
> 28: 5e pop %esi
> 29: 5d pop %ebp
> 2a: c3 ret
>
> so the oops is in the final
>
> list_add_tail(&timer->entry, vec);
>
> where "%ecx" is "vec->prev" (f8c551f4). That looks like it might be a
> perfectly valid pointer, but clearly it isn't (it's about 115M off the
> top of virtual memory, I think that might be in the vmalloc area).
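The "about 115M" estimate can be checked with a few lines of C. This is a minimal sketch, assuming a 32-bit address space whose top is at 4 GiB; bytes_below_top() and print_distance() are hypothetical helper names for illustration, not kernel functions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Bytes between a 32-bit virtual address and the 4 GiB top of the
 * address space (assumption: plain 32-bit addressing). */
uint64_t bytes_below_top(uint32_t addr)
{
    return (UINT64_C(1) << 32) - addr;
}

/* Print the distance in bytes and in whole MiB (truncated). */
void print_distance(uint32_t addr)
{
    uint64_t d = bytes_below_top(addr);

    printf("0x%08" PRIx32 " is %" PRIu64 " bytes (%" PRIu64 " MiB) below 4 GiB\n",
           addr, d, d >> 20);
}
```

For f8c551f4 the distance is 0x073aae0c bytes, which truncates to 115 MiB.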
>
> So I'm *guessing* that something did a vfree() on some data structure
> that contained active timers - and then later on the RCU thread ended
> up being the next thing that tried to add a timer after the
> now-non-existing one.
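The mapping between the disassembly and list_add_tail() can be sketched in userspace C. This is an illustration only, not the kernel source: init_list_head() and list_add_tail_sketch() are hypothetical names, and the register assignments in the comments (%eax as vec, %edx as &timer->entry) are inferred from the listing above and from the statement that %ecx holds vec->prev.

```c
/* Userspace stand-in for the kernel's struct list_head (illustrative). */
struct list_head {
    struct list_head *next;   /* offset 0 on 32-bit x86 */
    struct list_head *prev;   /* offset 4 on 32-bit x86 */
};

/* An empty list is a head that points to itself in both directions. */
void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 'entry' just before 'head', i.e. at the tail of the list. */
void list_add_tail_sketch(struct list_head *entry, struct list_head *head)
{
    struct list_head *prev = head->prev; /* mov 0x4(%eax),%ecx */

    head->prev = entry;                  /* mov %edx,0x4(%eax) */
    entry->next = head;                  /* mov %eax,(%edx)    */
    entry->prev = prev;                  /* mov %ecx,0x4(%edx) */
    prev->next = entry;                  /* mov %edx,(%ecx): the trapping store */
}
```

The first four accesses only touch the list head and the new entry. The last store is the first write through the old tail pointer, so if the previous tail entry lived in memory that has since been freed and unmapped, that is where the page fault would be expected to show up.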
>
> And your other oopses do seem to have a similar pattern, even if their
> actual oops is elsewhere. They oops in run_timer_softirq, also taking
> a page fault in the 0xf9...... range, so it might well be a vmalloc
> address there too.
>
> But I sure as hell can't start to guess what that would be.
>
> I'm wondering if CONFIG_DEBUG_OBJECTS (and then
> CONFIG_DEBUG_OBJECTS_FREE=y and CONFIG_DEBUG_OBJECTS_TIMERS=y) might
> help catch this...
I would also like to nominate CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, which
checks for invoking call_rcu() twice in a row on the same rcu_head.
Any chance of a look at the .config file?
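Taken together, the debug options suggested in this thread would look roughly like the following .config fragment. Whether each one is selectable depends on the architecture and on other options in the tree being built, so treat this as a sketch rather than a verified configuration.

```
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_TIMERS=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
```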
Thanx, Paul
> Linus
>
> On Mon, Oct 14, 2013 at 4:07 AM, Knut Petersen
> <Knut_Petersen@t-online.de> wrote:
> >
> > It's the third time in four months that I have to report a kernel Oops
> > during shutdown.
> > All of these Oopses seem somehow related to the timer subsystem, but they
> > are not easily reproducible. As all this happens on two different
> > machines, it's unlikely that this mess is related to bad hardware.
> >
> > I clearly would appreciate any idea how to track this down.
> >
> > For the last two reports see:
> >
> > http://www.gossamer-threads.com/lists/linux/kernel/1782575?#1782575
> >
> > http://www.gossamer-threads.com/lists/linux/kernel/1744892?#1744892
> >
> > This time the kernel oopsed after systemd reported that target shutdown
> > had been reached - see attached pdf for the full trace. To make it easier
> > to find this problem a shortened call trace:
> >
> >
> > Call Trace:
> > internal_add_timer
> > schedule_timeout
> > ? call_timer_fn
> > rcu_gp_kthread
> > __init_waitqueue_head
> > ? rcu_gp_fqs
> > kthread
> > ret_from_kernel_thread
> > ? __init_kthread_worker
> >
> > EIP: __internal_add_timer
> >
> > Hardware: AOpen i915GMm-hfs mobo with a Pentium-M Dothan and 2GB of RAM.
> > Distribution: openSuSE 12.3
> > Kernel: local 3.12.0-rc4-00127-g45877c4 is kernel 9d05746 with my
> > "Enforce 1 as lower limit for perf_event_max_sample_rate"
> > patch applied.
> >
> > cu,
> > knut
>