From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner+willy=40w.ods.org-S932099AbWDRRLT@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932099AbWDRRLT (ORCPT <rfc822;willy@w.ods.org>);
	Tue, 18 Apr 2006 13:11:19 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S932114AbWDRRLT
	(ORCPT <rfc822;linux-kernel-outgoing>);
	Tue, 18 Apr 2006 13:11:19 -0400
Received: from cantor2.suse.de ([195.135.220.15]:4060 "EHLO mx2.suse.de")
	by vger.kernel.org with ESMTP id S932099AbWDRRLS (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 18 Apr 2006 13:11:18 -0400
From: Andi Kleen <ak@suse.de>
To: Martin Bligh <mbligh@mbligh.org>
Subject: Re: 2.6.17-rc1-mm3 dies in LTP on amd64
Date: Tue, 18 Apr 2006 19:11:12 +0200
User-Agent: KMail/1.9.1
Cc: Andrew Morton <akpm@osdl.org>, linux-kernel <linux-kernel@vger.kernel.org>,
       jbeulich@novell.com
References: <44451AD5.9070709@mbligh.org>
In-Reply-To: <44451AD5.9070709@mbligh.org>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200604181911.13012.ak@suse.de>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Tuesday 18 April 2006 18:59, Martin Bligh wrote:
> Runs most tests just fine, but not LTP.
> -mm2 ran LTP fine.

I don't think it's my patchkit; it currently only has harmless things.

> Full log here:
> http://test.kernel.org/abat/28728/debug/console.log
> 
> The trainwreck starts with:
> 
> Modules linked in:
> Pid: 228, comm: kswapd0 Not tainted 2.6.17-rc1-mm3-autokern1 #1
> RIP: 0010:[<ffffffff8047a8dc>] <ffffffff8047a8dc>{__sched_text_start+1852}
> RSP: 0000:0000000000000000  EFLAGS: 00010046
> RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff805d9338
> RDX: ffff8100010c5090 RSI: ffffffff805d9338 RDI: ffff8100010c5090
> RBP: ffffffff805d9338 R08: 0000000000000010 R09: ffff8100e3e63d28
> R10: ffff8100e3e63a88 R11: 000000000000000b R12: ffff810000011280
> R13: ffff81007e186f40 R14: ffff810008003620 R15: 000002b9f81aa1c4
> FS:  0000000000000000(0000) GS:ffffffff805fa000(0000) knlGS:00000000f7ea7460
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: fffffffffffffff8 CR3: 000000007174a000 CR4: 00000000000006e0
> Process kswapd0 (pid: 228, threadinfo ffff8100e3e62000, task ffff8100010c5090)
> Stack: ffffffff80578e20 ffff8100010c5090 0000000000000001 ffffffff80578f58
>         0000000000000000 ffffffff80578e78 ffffffff8020b082 ffffffff80578f58
>         0000000000000000 ffffffff80483520
> Call Trace: <#DF> <ffffffff8020b082>{show_registers+140}
>         <ffffffff8020b30b>{__die+159} <ffffffff8020b380>{die+50}
>         <ffffffff8020bb46>{do_double_fault+115} <ffffffff8020aa61>{double_fault+125}
>         <ffffffff8047a8dc>{__sched_text_start+1852} <EOE>


Not very useful. Something double faulted, but it's not on the stack.
[I wonder if the stack walker over double faults is broken. Jan - did you
ever test that after you redid the walker?]

If you can reproduce it on an Intel machine it might be possible to find
it using the last branch registers (patch for that available on request).

Otherwise binary search I guess.

-Andi