From: "Martin J. Bligh" <mbligh@google.com>
To: Andrew Morton <akpm@osdl.org>
Cc: linuxppc64-dev@ozlabs.org, Andi Kleen <ak@suse.de>,
	linux-kernel@vger.kernel.org
Subject: Re: 2.6.17-rc2-mm1
Date: Mon, 01 May 2006 07:24:30 -0700
Message-ID: <44561A1E.7000103@google.com>
In-Reply-To: <20060428012022.7b73c77b.akpm@osdl.org>

Andrew Morton wrote:
> (I did s/linux-kernel@google.com/linux-kernel@vger.kernel.org/)
> 
> Martin Bligh <mbligh@google.com> wrote:
> 
>>Still crashes in LTP on x86_64:
>>(introduced in previous release)
>>
>>http://test.kernel.org/abat/29674/debug/console.log
> 
> 
> What a mess.  A doublefault inside an NMI watchdog timeout.  I think.  It's
> hard to see.  Some CPUs are stuck on a CPU scheduler lock, others seem to
> be stuck in flush_tlb_others.  One of these could be a consequence of the
> other, or both could be a consequence of something else.

OK, well the latest one seems cleaner, on -rc3-mm1.
http://test.kernel.org/abat/30007/debug/console.log

Just has the double fault, with no NMI watchdog timeouts. Not that
it means any more to me, but still ;-) mtest01 seems to be able to
reproduce this every time, but I don't have an appropriate box here
to diagnose it with (this was a 4x Opteron inside IBM), and it's
definitely something in -mm that's not in mainline.

M.

double fault: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:06.0/resource
CPU 0
Modules linked in:
Pid: 20519, comm: mtest01 Not tainted 2.6.17-rc3-mm1-autokern1 #1
RIP: 0010:[<ffffffff8047c8b8>] <ffffffff8047c8b8>{__sched_text_start+1856}
RSP: 0000:0000000000000000  EFLAGS: 00010082
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff805d9438
RDX: ffff8100db12c0d0 RSI: ffffffff805d9438 RDI: ffff8100db12c0d0
RBP: ffffffff805d9438 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8100e39bd440 R14: ffff810008003620 R15: 000002b02751726c
FS:  0000000000000000(0000) GS:ffffffff805fa000(0063) knlGS:00000000f7dd0460
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: fffffffffffffff8 CR3: 00000000da399000 CR4: 00000000000006e0
Process mtest01 (pid: 20519, threadinfo ffff8100b1bb4000, task ffff8100db12c0d0)
Stack: ffffffff80579e20 ffff8100db12c0d0 0000000000000001 ffffffff80579f58
        0000000000000000 ffffffff80579e78 ffffffff8020b0b2 ffffffff80579f58
        0000000000000000 ffffffff80485520
Call Trace: <#DF> <ffffffff8020b0b2>{show_registers+140}
        <ffffffff8020b357>{__die+159} <ffffffff8020b3cc>{die+50}
        <ffffffff8020bba6>{do_double_fault+115} <ffffffff8020aa91>{double_fault+125}
        <ffffffff8047c8b8>{__sched_text_start+1856} <EOE>

Code: e8 4c ba d8 ff 65 48 8b 34 25 00 00 00 00 4c 8b 46 08 f0 41
RIP <ffffffff8047c8b8>{__sched_text_start+1856} RSP <0000000000000000>
  -- 0:conmux-control -- time-stamp -- May/01/06  3:54:37 --
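
For anyone decoding the trace above by hand, here is a minimal sketch (not part of the original report) that pulls each `<address>{symbol+offset}` entry out of the oops text and computes the start address of the symbol. It assumes the offset after `+` is a decimal byte count, which is how this x86_64 oops format appears to print it; the helper name `parse_frames` and the `SAMPLE` string are made up for illustration.

```python
import re

# Matches trace entries such as <ffffffff8020b357>{__die+159}.
# Markers like <#DF> and <EOE> do not match and are skipped.
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{(\w+)\+(\d+)\}")


def parse_frames(text):
    """Return (address, symbol, offset, symbol_base) for each trace entry."""
    frames = []
    for addr_hex, symbol, offset_dec in FRAME_RE.findall(text):
        addr = int(addr_hex, 16)
        offset = int(offset_dec)  # assumption: the offset is printed in decimal
        frames.append((addr, symbol, offset, addr - offset))
    return frames


# Call trace copied from the report above.
SAMPLE = """Call Trace: <#DF> <ffffffff8020b0b2>{show_registers+140}
        <ffffffff8020b357>{__die+159} <ffffffff8020b3cc>{die+50}
        <ffffffff8020bba6>{do_double_fault+115}
<ffffffff8020aa91>{double_fault+125}
        <ffffffff8047c8b8>{__sched_text_start+1856} <EOE>"""

for addr, symbol, offset, base in parse_frames(SAMPLE):
    print(f"{symbol:20s} base={base:#018x} offset={offset:#x}")
```

Under that assumption, the faulting RIP ffffffff8047c8b8 sits 0x740 bytes past `__sched_text_start`, which can be compared against a disassembly of the matching vmlinux to find the instruction involved.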
