From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Knut Petersen" <Knut_Petersen@t-online.de>,
"Ingo Molnar" <mingo@kernel.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Greg KH" <greg@kroah.com>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [BUG 3.12.rc4] Oops: unable to handle kernel paging request during shutdown
Date: Mon, 14 Oct 2013 14:28:30 -0700
Message-ID: <20131014212830.GD5790@linux.vnet.ibm.com>
In-Reply-To: <CA+55aFwN++wO=rTFaH7m6YYQX+Fv3qDt6Hxs7UnPUJpFsrwSkA@mail.gmail.com>
On Mon, Oct 14, 2013 at 10:53:03AM -0700, Linus Torvalds wrote:
> Hmm. No obvious ideas come to mind, but I'm adding more people to the cc.
>
> Clearly the wait_event_interruptible_timeout() in the RCU grace-period
> thread causes this, but I'm not seeing why shutdown would trigger it.
>
> The code disassembles to
>
> 0: 85 db test %ebx,%ebx
> 2: 79 0c jns 0x10
> 4: 81 e6 ff 00 00 00 and $0xff,%esi
> a: 8d 44 f0 30 lea 0x30(%eax,%esi,8),%eax
> e: eb 0a jmp 0x1a
> 10: c1 e9 1a shr $0x1a,%ecx
> 13: 8d 84 c8 30 0e 00 00 lea 0xe30(%eax,%ecx,8),%eax
> 1a: 8b 48 04 mov 0x4(%eax),%ecx
> 1d: 89 50 04 mov %edx,0x4(%eax)
> 20: 89 02 mov %eax,(%edx)
> 22: 89 4a 04 mov %ecx,0x4(%edx)
> 25:* 89 11 mov %edx,(%ecx) <-- trapping instruction
> 27: 5b pop %ebx
> 28: 5e pop %esi
> 29: 5d pop %ebp
> 2a: c3 ret
>
> so the oops is in the final
>
> list_add_tail(&timer->entry, vec);
>
> where "%ecx" is "vec->prev" (f8c551f4). That looks like it might be a
> perfectly valid pointer, but clearly it isn't (it's about 115M off the
> top of virtual memory, I think that might be in the vmalloc area).
>
> So I'm *guessing* that something did a vfree() on some data structure
> that contained active timers - and then later on the RCU thread ended
> up being the next thing that tried to add a timer after the
> now-non-existing one.
>
> And your other oopses do seem to have a similar pattern, even if their
> actual oops is elsewhere. They oops in run_timer_softirq, also taking
> a page fault in the 0xf9...... range, so it might well be a vmalloc
> address there too.
>
> But I sure as hell can't start to guess what that would be.
>
> I'm wondering if CONFIG_DEBUG_OBJECTS (and then
> CONFIG_DEBUG_OBJECTS_FREE=y and CONFIG_DEBUG_OBJECTS_TIMERS=y) might
> help catch this...

I would also like to nominate CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, which
checks for invoking call_rcu() twice in a row on the same rcu_head.

Any chance of a look at the .config file?

							Thanx, Paul

> Linus
>
> On Mon, Oct 14, 2013 at 4:07 AM, Knut Petersen
> <Knut_Petersen@t-online.de> wrote:
> >
> > It's the third time in four months that I have to report a kernel Oops
> > during shutdown. All of these Oopses seem somehow related to the timer
> > subsystem, but they are not easily reproducible. As all this happens on
> > two different machines, it's unlikely that this mess is related to bad
> > hardware.
> >
> > I clearly would appreciate any idea how to track this down.
> >
> > For the last two reports see:
> >
> > http://www.gossamer-threads.com/lists/linux/kernel/1782575?#1782575
> >
> > http://www.gossamer-threads.com/lists/linux/kernel/1744892?#1744892
> >
> > This time the kernel oopsed after systemd reported that target shutdown
> > had been reached - see attached pdf for the full trace. To make it easier
> > to find this problem a shortened call trace:
> >
> >
> > Call Trace:
> > internal_add_timer
> > schedule_timeout
> > ? call_timer_fn
> > rcu_gp_kthread
> > __init_waitqueue_head
> > ? rcu_gp_fqs
> > kthread
> > ret_from_kernel_thread
> > ? __init_kthread_worker
> >
> > EIP: __internal_add_timer
> >
> > Hardware: AOpen i915GMm-hfs mobo with a Pentium-M Dothan and 2GB of RAM.
> > Distribution: openSuSE 12.3
> > Kernel: local 3.12.0-rc4-00127-g45877c4 is kernel 9d05746 with my
> > "Enforce 1 as lower limit for perf_event_max_sample_rate"
> > patch applied.
> >
> > cu,
> > knut
>
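
The analysis quoted above, which maps the trapping store to the final
list_add_tail(), can be sketched in user space. This is a minimal
illustration under stated assumptions, not the kernel's code: struct
list_head is re-declared locally, and init_list_head(),
list_add_tail_sketch() and demo_tail_insert() are hypothetical names
chosen for this example. The register comments assume the 32-bit x86
layout shown in the disassembly, with the list head in %eax, the new
entry in %edx and the old tail in %ecx.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
	struct list_head *next;	/* offset 0x0 with 32-bit pointers */
	struct list_head *prev;	/* offset 0x4 with 32-bit pointers */
};

/* An empty list is a head that points at itself in both directions. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Insert new_entry just before head, i.e. at the tail of the list.
 * The one load and four stores mirror the end of the quoted disassembly.
 */
static void list_add_tail_sketch(struct list_head *new_entry,
				 struct list_head *head)
{
	struct list_head *prev = head->prev;	/* mov 0x4(%eax),%ecx */

	head->prev = new_entry;			/* mov %edx,0x4(%eax) */
	new_entry->next = head;			/* mov %eax,(%edx)    */
	new_entry->prev = prev;			/* mov %ecx,0x4(%edx) */
	prev->next = new_entry;			/* mov %edx,(%ecx): the store that trapped */
}

/* Build head -> a -> b -> head and verify every link; returns 1 on success. */
static int demo_tail_insert(void)
{
	struct list_head head, a, b;

	init_list_head(&head);
	list_add_tail_sketch(&a, &head);
	list_add_tail_sketch(&b, &head);

	assert(head.next == &a && a.next == &b && b.next == &head);
	assert(head.prev == &b && b.prev == &a && a.prev == &head);
	return 1;
}
```

In this sketch only the last store dereferences the old tail pointer;
the earlier ones touch just the head and the new entry. That would be
consistent with the guess above: if the old tail lived in memory that
was freed while still linked, the insertion would run cleanly until the
write through prev, which is where the fault at offset 0x25 was
reported.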