From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org
Subject: Re: 2.6.23-rc7-mm1
Date: Mon, 24 Sep 2007 21:20:58 +0200
Message-ID: <20070924212058.76017e47@lappy>
In-Reply-To: <46F7EEF3.7080902@linux.vnet.ibm.com>
On Mon, 24 Sep 2007 22:38:03 +0530 Kamalesh Babulal
<kamalesh@linux.vnet.ibm.com> wrote:
> Peter Zijlstra wrote:
> > On Mon, 24 Sep 2007 09:44:48 -0700 Andrew Morton
> > <akpm@linux-foundation.org> wrote:
> >
> >> On Mon, 24 Sep 2007 18:43:33 +0530 Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> >>
> >>> Hi Andrew,
> >>>
> >>> Kernel BUG over x86_64 (AMD Opteron(tm) Processor 844).
> >>>
> >>> A similar kernel BUG was reported for 2.6.23-rc2-mm1
> >>> at http://lkml.org/lkml/2007/8/10/20 and the
> >>> mm-dirty-balancing-for-tasks.patch was dropped from 2.6.23-rc2-mm2.
> >>> The same patch is in this -mm version, so I wonder whether it is the
> >>> same patch triggering this BUG.
> >>>
> >>> BUG: soft lockup - CPU#0 stuck for 11s! [events/0:15]
> >>> CPU 0:
> >>> Modules linked in:
> >>> Pid: 15, comm: events/0 Tainted: G D 2.6.23-rc7-mm1-autokern1 #1
> >>> RIP: 0010:[<ffffffff8021be46>] [<ffffffff8021be46>] __smp_call_function_mask+0x9a/0xc4
> >>> RSP: 0000:ffff8100017add80 EFLAGS: 00000297
> >>> RAX: 00000000000000fc RBX: ffff8100017adde0 RCX: 0000000000000001
> >>> RDX: 00000000000008fc RSI: 00000000000000fc RDI: 000000000000000e
> >>> RBP: ffffc20002d11000 R08: ffff8100017ac000 R09: ffffffff80675e38
> >>> R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000f
> >>> R13: ffffffff8021bcfe R14: 0000000000000000 R15: 0000000000000001
> >>> FS: 0000000000000000(0000) GS:ffffffff8065a000(0000) knlGS:00000000556aa2a0
> >>> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> >>> CR2: ffffc20002d11008 CR3: 0000000000201000 CR4: 00000000000006e0
> >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>>
> >>> Call Trace:
> >>> Inexact backtrace:
> >>> [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
> >>> [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
> >>> [<ffffffff8021becf>] smp_call_function_mask+0x5f/0x72
> >>> [<ffffffff802157a4>] mcheck_check_cpu+0x0/0x31
> >>> [<ffffffff8021bf82>] smp_call_function+0x19/0x1b
> >>> [<ffffffff8023a773>] on_each_cpu+0x16/0x2b
> >>> [<ffffffff802158a2>] mcheck_timer+0x0/0x7c
> >>> [<ffffffff802158c0>] mcheck_timer+0x1e/0x7c
> >>> [<ffffffff802444b9>] run_workqueue+0x88/0x109
> >>> [<ffffffff8024453a>] worker_thread+0x0/0xf4
> >>> [<ffffffff80244623>] worker_thread+0xe9/0xf4
> >>> [<ffffffff8024841d>] autoremove_wake_function+0x0/0x37
> >>> [<ffffffff8024841d>] autoremove_wake_function+0x0/0x37
> >>> [<ffffffff80247e5c>] kthread+0x44/0x6d
> >>> [<ffffffff8020c5a8>] child_rip+0xa/0x12
> >>> [<ffffffff80247e18>] kthread+0x0/0x6d
> >>> [<ffffffff8020c59e>] child_rip+0x0/0x12
> >> hm, I thought we'd fixed the problems in that patchset. Peter, were
> >> you aware of this one?
> >
> > Nope, and the stacktrace is utterly puzzling.
> >
> > /me goes read the lkml.org link
> >
> > Kamalesh Babulal: do you still get:
> > BUG: spinlock bad magic on
> >
> > msgs?
> >
> > Because those I could reproduce using fsx, and I fixed all that.
> Hi Peter,
>
> I do not get BUG: spinlock bad magic messages any more, but the soft lockup
> message is thrown more than 30 times while running the LTP runall.
It would be good to know which function on_each_cpu is executing when the
lockup is reported. Could you try something like:
---
kernel/softirq.c | 5 +++++
kernel/softlockup.c | 7 +++++++
2 files changed, 12 insertions(+)
Index: linux-2.6/kernel/softirq.c
===================================================================
--- linux-2.6.orig/kernel/softirq.c
+++ linux-2.6/kernel/softirq.c
@@ -645,6 +645,8 @@ __init int spawn_ksoftirqd(void)
}
#ifdef CONFIG_SMP
+
+DEFINE_PER_CPU(void (*)(void *info), last_on_each_cpu);
/*
* Call a function on all processors
*/
@@ -653,6 +655,9 @@ int on_each_cpu(void (*func) (void *info
int ret = 0;
preempt_disable();
+
+ per_cpu(last_on_each_cpu, smp_processor_id()) = func;
+
ret = smp_call_function(func, info, retry, wait);
local_irq_disable();
func(info);
Index: linux-2.6/kernel/softlockup.c
===================================================================
--- linux-2.6.orig/kernel/softlockup.c
+++ linux-2.6/kernel/softlockup.c
@@ -15,6 +15,8 @@
#include <linux/notifier.h>
#include <linux/module.h>
#include <linux/kgdb.h>
+#include <linux/percpu.h>
+#include <linux/kallsyms.h>
#include <asm/irq_regs.h>
@@ -71,6 +73,8 @@ void touch_all_softlockup_watchdogs(void
}
EXPORT_SYMBOL(touch_all_softlockup_watchdogs);
+DECLARE_PER_CPU(void (*)(void *), last_on_each_cpu);
+
/*
* This callback runs from the timer interrupt, and checks
* whether the watchdog thread has hung or not:
@@ -122,6 +126,9 @@ void softlockup_tick(void)
printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %lus! [%s:%d]\n",
this_cpu, now - touch_timestamp,
current->comm, task_pid_nr(current));
+ printk(KERN_ERR " last_on_each_cpu: [<%p>] ",
+ per_cpu(last_on_each_cpu, this_cpu));
+ print_symbol("%s\n", (unsigned long)per_cpu(last_on_each_cpu, this_cpu));
if (regs)
show_regs(regs);
else
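
For anyone who wants to see the idea outside the kernel, here is a rough
userspace sketch of what the patch does: remember, per CPU, the last callback
handed to the broadcast helper, so that a watchdog can name it later. It is
not kernel code. NR_CPUS_DEMO, record_and_run(), RECORD_AND_RUN(),
report_last() and bump_counter() are all made-up names, and a plain array plus
a stored name string stands in for DEFINE_PER_CPU and print_symbol().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NR_CPUS_DEMO 4	/* hypothetical CPU count for the sketch */

typedef void (*cpu_func_t)(void *info);

/* One slot per simulated CPU; stands in for the per-CPU variable. */
static struct {
	cpu_func_t func;
	const char *name;	/* userspace substitute for a symbol lookup */
} last_on_each_cpu[NR_CPUS_DEMO];

/* Record the callback before running it, as the patched on_each_cpu() does. */
static void record_and_run(int cpu, cpu_func_t func, const char *name,
			   void *info)
{
	assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
	last_on_each_cpu[cpu].func = func;
	last_on_each_cpu[cpu].name = name;
	func(info);
}

/* Stringify the callback so the report can print its name. */
#define RECORD_AND_RUN(cpu, f, info) record_and_run((cpu), (f), #f, (info))

/* What a watchdog would print for a CPU; returns the recorded callback. */
static cpu_func_t report_last(int cpu)
{
	const char *name;

	assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
	name = last_on_each_cpu[cpu].name;
	printf("CPU#%d last_on_each_cpu: %s\n", cpu, name ? name : "(none)");
	return last_on_each_cpu[cpu].func;
}

/* Example callback: increments the int that info points to. */
static void bump_counter(void *info)
{
	++*(int *)info;
}
```

Calling RECORD_AND_RUN(1, bump_counter, &hits) and then report_last(1) should
name bump_counter for CPU 1, while a CPU that never ran a callback reports
"(none)". The kernel patch gets the same effect from the raw pointer and
print_symbol(), so it does not need the stored name.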