From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752858AbXCYWE5 (ORCPT ); Sun, 25 Mar 2007 18:04:57 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752864AbXCYWE5 (ORCPT ); Sun, 25 Mar 2007 18:04:57 -0400
Received: from smtp.osdl.org ([65.172.181.24]:59997 "EHLO smtp.osdl.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752858AbXCYWEz (ORCPT ); Sun, 25 Mar 2007 18:04:55 -0400
Date: Sun, 25 Mar 2007 14:01:31 -0800
From: Andrew Morton
To: "Torsten Kaiser"
Cc: "Con Kolivas", "Andy Whitcroft", "William Lee Irwin III", linux-kernel@vger.kernel.org, "Steve Fox", "Martin J. Bligh"
Subject: Re: debug rsdl 0.33
Message-Id: <20070325140131.ebc97e20.akpm@linux-foundation.org>
In-Reply-To: <64bb37e0703251128q3f9db894u24c4638dcf97224a@mail.gmail.com>
References: <20070319205623.299d0378.akpm@linux-foundation.org> <4603C7EC.6030906@shadowen.org> <200703240845.30484.kernel@kolivas.org> <200703241026.57143.kernel@kolivas.org> <64bb37e0703251128q3f9db894u24c4638dcf97224a@mail.gmail.com>
X-Mailer: Sylpheed version 2.2.7 (GTK+ 2.8.17; x86_64-unknown-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 25 Mar 2007 19:28:57 +0100 "Torsten Kaiser" wrote:

> On 3/24/07, Con Kolivas wrote:
> > kernel/sched.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 51 insertions(+)
>
> 2.6.21-rc4-mm1 also fails for me.
>
> I tried pure 2.6.21-rc4-mm1, +hotfixes, +hotfixes+rsdl33 and at last
> also added above debug patch.
>
> The oops from with the debug-patch added:
> [ 65.426126] Freeing unused kernel memory: 312k freed
> (on the console the system is starting up, getting until "Letting udev
> process events ...")
> [ 66.665611] Unable to handle kernel NULL pointer dereference at
> 0000000000000020 RIP:
> [ 66.682030] [] __sched_text_start+0x4dc/0xa0e
> [ 66.707402] PGD 0
> [ 66.713473] Oops: 0000 [1] SMP
> [ 66.722968] last sysfs file:
> devices/pci0000:00/0000:00:05.0/host2/target2:0:0/2:0:0:0/type
> [ 66.747954] CPU 0
> [ 66.754025] Modules linked in:
> [ 66.763209] Pid: 1200, comm: udevd Not tainted 2.6.21-rc4-mm1 #4
> [ 66.781162] RIP: 0010:[] []
> __sched_text_start+0x4dc/0xa0e
> [ 66.807236] RSP: 0018:ffff81007d38fe78 EFLAGS: 00010082
> [ 66.823115] RAX: ffffffffffffffd0 RBX: 000000000000008c RCX: 000000000000058e
> [ 66.844439] RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000000000
> [ 66.865767] RBP: ffff81007d38ff08 R08: 0000000000000064 R09: ffff810001014a58
> [ 66.887092] R10: 000000000000001c R11: 0000000000000246 R12: ffff810001013700
> [ 66.908418] R13: ffff810001014198 R14: 0000000000000001 R15: 0000000f859461fc
> [ 66.929745] FS: 00002b67df90e6d0(0000) GS:ffffffff807aa000(0000)
> knlGS:0000000000000000
> [ 66.953950] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 66.971126] CR2: 0000000000000020 CR3: 0000000000201000 CR4: 00000000000006e0
> [ 66.992451] Process udevd (pid: 1200, threadinfo ffff81007d38e000,
> task ffff81007e354100)
> [ 67.016915] Stack: 00000000000004b0 0000000000000000
> 0000000000000000 ffff81007e354100
> [ 67.041097] ffffffffffffffd0 ffff81007e354298 ffff81011d420680
> ffffffff802234b1
> [ 67.063407] 0000000000000001 0000000000000000 0000000000000000
> 0000000000000246
> [ 67.085149] Call Trace:
> [ 67.093037] [] filp_close+0x71/0x90
> [ 67.108397] [] do_exit+0x7e7/0x800
> [ 67.123495] [] do_group_exit+0x82/0x90
> [ 67.139634] [] system_call+0x7e/0x83
> [ 67.155277]
> [ 67.159739]
> [ 67.159740] Code: 48 39 48 50 0f 84 8b 00 00 00 48 c7 40 40 00 00 00 00 8b 52
> [ 67.186877] RIP [] __sched_text_start+0x4dc/0xa0e
> [ 67.205919] RSP
> [ 67.216348] CR2: 0000000000000020
> [ 67.226260] Fixing recursive fault but reboot is needed!

We've seen multiple reports of this. For some reason we've managed to
confuse kallsyms too.

> The system in x86_64, two 2218 on a MCP55 nvidia chipset.
>
> 2.6.21-rc3-mm1 works fine.
>
> (gdb) list *0xffffffff8026167c
> 0xffffffff8026167c is in schedule (kernel/sched.c:3619).
> 3614         /*
> 3615          * When the task is chosen it is checked to see if its
> quota has been
> 3616          * added to this runqueue level which is only performed once per
> 3617          * level per major rotation for each running task.
> 3618          */
> 3619         if (next->rotation != rq->prio_rotation) {
> 3620                 /* Task has moved during major rotation */
> 3621                 task_new_array(next, rq);
> 3622                 if (!entitled_slot(next->static_prio, idx))
> 3623                         exchange_slot(next, rq);
>
>

Ah, that helps, thanks.