From: Andrew Morton <akpm@linux-foundation.org>
To: Andi Kleen <andi@firstfloor.org>
Cc: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
linux-kernel@vger.kernel.org,
oprofile-list@lists.sourceforge.net, netdev@vger.kernel.org,
rusty@rustcorp.com.au
Subject: Re: [BUG 2.6.30-rc1] panic when loading oprofile
Date: Sat, 16 May 2009 16:20:17 -0700
Message-ID: <20090516162017.2525511c.akpm@linux-foundation.org>
In-Reply-To: <87fxf6pmh5.fsf@basil.nowhere.org>

On Fri, 15 May 2009 14:41:26 +0200 Andi Kleen <andi@firstfloor.org> wrote:
> "Brandeburg, Jesse" <jesse.brandeburg@intel.com> writes:
>
> Hi Jesse,
>
> > when starting a profile run on the latest net-next kernel, I'm currently
> > trying to reproduce on 2.6.30-rc5 stock.
>
> Were you able to reproduce it?
>
> >
> > config available upon request, arch=x86_64, recent (F10 or newer) oprofile
> > userspace.
>
> it looks like two bugs: oprofile didn't catch a NMI that belongs to
> it (most likely) and the NMI watchdog referenced a NULL pointer
> while processing an NMI.
>
> Did you have the nmi watchdog enabled on the command line?
>
> >
> > BUG: unable to handle kernel NULL pointer dereference at (null)
> > IP: [<ffffffff8066080a>] nmi_watchdog_tick+0xa1/0x1d6
>
> I don't get the same code as you. But the oopsing instruction in your
> oops is
>
>   2b:*   44 0f a3 28      bt     %r13d,(%rax)     <-- trapping instruction
>
> with rax == 0 and I suspect it's one of the new cpu mask checks
> I would try reverting
>
> fcc5c4a2feea3886dc058498b28508b2731720d5
> 2f537a9f8e82f55c241b002c8cfbf34303b45ada
> fcef8576d8a64fc603e719c97d423f9f6d4e0e8b
>
> and see which one causes it. That would only fix the NMI watchdog bug
> of course.
>
> The oprofile not catching a event problem would be still open then.
> I think the checks for overflowed counters are not 100% perfect
> so that could happen. I have some patches in the works to use the new
> global status register on arch perfmon 2, with that the overflow
> check is somewhat more reliable. But that's more work.
>

Ping?

This is in Rafael's regression list but I suspect that it's a
linux-next-only thing?