From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S264650AbUEOG3T (ORCPT ); Sat, 15 May 2004 02:29:19 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S264653AbUEOG3T (ORCPT ); Sat, 15 May 2004 02:29:19 -0400 Received: from fw.osdl.org ([65.172.181.6]:22919 "EHLO mail.osdl.org") by vger.kernel.org with ESMTP id S264650AbUEOG3R (ORCPT ); Sat, 15 May 2004 02:29:17 -0400 Date: Fri, 14 May 2004 23:28:42 -0700 From: Andrew Morton To: linux@horizon.com Cc: linux-kernel@vger.kernel.org Subject: Re: 2.6.6 is crashing repeatedly Message-Id: <20040514232842.63fd3240.akpm@osdl.org> In-Reply-To: <20040515062258.7048.qmail@science.horizon.com> References: <20040515062258.7048.qmail@science.horizon.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org linux@horizon.com wrote: > > I have now captured a kernel crash. Everything after iput in the second crash > was hand-coped, and may suffer from transcription errors, but it was done > quite carefully. > > System has ECC memory and has been very stable, with uptimes in excess of > 1 year when kernel upgrades were infrequent (2.5 development). > > Stock 2.6.6 kernel, config as posted before. > > Unable to handle kernel NULL pointer dereference at virtual address 00000004 > printing eip: > c012a392 > *pde = 00000000 > Oops: 0002 [#1] > CPU: 0 > EIP: 0060:[] Not tainted > EFLAGS: 00010012 (2.6.6) > EIP is at free_block+0x52/0xd0 > eax: 00000000 ebx: e9a3f000 ecx: e9a3f200 edx: df654000 > esi: f7f8a560 edi: 00000016 ebp: f7f8a56c esp: f7d89dec > ds: 007b es: 007b ss: 0068 > Process kswapd0 (pid: 8, threadinfo=f7d88000 task=f7d8eb50) > Stack: f7f8a57c 0000001b c17fd784 c17fd784 f7f8a560 dc3cdac0 0000001b c012a449 > f7fe73dc c17fd774 c17fd774 00000296 dc3cdac0 c037a304 c012a61a dc3cdb40 > f7d89e5c 0000003d c014f385 dc3cdb40 c014f5c3 dc19c0c8 dc19c0c0 00000080 > Call Trace: > [] cache_flusharray+0x39/0xc0 > [] kmem_cache_free+0x3a/0x50 > [] destroy_inode+0x35/0x40 > [] dispose_list+0x43/0x70 > [] prune_icache+0xae/0x1b0 > [] shrink_icache_memory+0x15/0x20 Drat, random memory corruption. Can you enable CONFIG_SLAB_DEBUG? And CONFIG_DEBUG_PAGEALLOC too, although beware that the latter is a bit costly in terms of CPU cycles and memory usage.