From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S265063AbUEYTBu (ORCPT ); Tue, 25 May 2004 15:01:50 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S265050AbUEYTBk (ORCPT ); Tue, 25 May 2004 15:01:40 -0400
Received: from parcelfarce.linux.theplanet.co.uk ([195.92.249.252]:49891 "EHLO www.linux.org.uk") by vger.kernel.org with ESMTP id S265049AbUEYTBa (ORCPT ); Tue, 25 May 2004 15:01:30 -0400
Date: Tue, 25 May 2004 16:02:51 -0300
From: Marcelo Tosatti
To: Doug Dumitru
Cc: linux-kernel@vger.kernel.org, cramerj@intel.com, john.ronciak@intel.com, Ganesh.Venkatesan@intel.com, jgarzik@pobox.com
Subject: Re: Hard Hang with __alloc_pages: 0-order allocation failed (gfp=0x20/1) - Not out of memory
Message-ID: <20040525190251.GB4377@logos.cnet>
References: <40B16736.6090303@easyco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <40B16736.6090303@easyco.com>
User-Agent: Mutt/1.5.5.1i
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, May 23, 2004 at 08:08:38PM -0700, Doug Dumitru wrote:
> I pulled some more information (if I did it correctly) from the first stack
> dump from the first __alloc_pages error log.
>
> ksymoops 2.4.4 on i686 2.4.25.
> Options used
>      -V (default)
>      -k ksyms.5 (specified)
>      -l /proc/modules (default)
>      -o /lib/modules/2.4.26/ (specified)
>      -m /boot/System.map-2.4.26 (specified)
>
> Warning (expand_objects): object
> /lib/modules/2.4.26/kernel/drivers/md/lvm-mod.o for module lvm-mod has
> changed since load
> Warning (expand_objects): object /lib/modules/2.4.26/kernel/drivers/md/md.o
> for module md has changed since load
> cc68bad8 c0135289 00000000 011410ac 00000001 0000000c c03689dc 0000
> cbccb780 cbccb780 c02d23ba c7c5b838
> Call Trace: [] [] [] []
> []
> [] [] [] [] []
> []
> [] [] [] [] []
> []
> [] [] [] [] []
> []
> [] [] [] []
> Warning (Oops_read): Code line not seen, dumping what data is available
>
> Trace; c0135289 <__alloc_pages+2d9/2f0>
> Trace; c01352b0 <__get_free_pages+10/20>
> Trace; c0132214
> Trace; c02d23ba
> Trace; c01327f1
> Trace; c029923f
> Trace; c01f0d3c
> Trace; c01f0c52
> Trace; c0121786
> Trace; c01219d9
> Trace; c01f05ec
> Trace; c010a4de
> Trace; c010a6f4
> Trace; c0133ce6
> Trace; c0134152
> Trace; c01341fc
> Trace; c0134271
> Trace; c0134dff
> Trace; c0135169 <__alloc_pages+1b9/2f0>
> Trace; c01352b0 <__get_free_pages+10/20>
> Trace; c014c203 <__pollwait+33/90>
> Trace; c02b765e
> Trace; c029634f
> Trace; c014c467
> Trace; c014c8e9
> Trace; c010a72d
> Trace; c0108b63
>
>
> If I am reading this correctly, the system was ...
>
>    in an interrupt
>    processing some TCP select(...) stuff
>    asking for a page
>    doing a zone rebalance
>    trying to shrink cache
>    and interrupted again
>    by the ethernet driver
>    which wanted to allocate an skb
>    which wanted a page
>
> Thus __alloc_pages appears to be called recursively, with the 2nd call
> during a rebalance in the first one and both calls non-interruptible (on
> interrupts).  Is this allowable?

Nope, this is not allowed.

It seems we are calling alloc_skb(GFP_KERNEL) from inside an interrupt
handler. Oops.

e1000 maintainers, can you look at this please?
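
The rule behind this reply can be sketched in userspace C. This is a toy
model only, not kernel or e1000 code: toy_gfp, toy_in_interrupt, toy_alloc
and toy_rx_interrupt are hypothetical names standing in for the GFP flags,
in_interrupt(), the page allocator and a driver's receive interrupt. It
assumes what is stated above, that an allocation which may sleep
(GFP_KERNEL) must not be issued from an interrupt handler, and that a
non-sleeping request (GFP_ATOMIC in the kernel) is the expected choice there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
enum toy_gfp {
    TOY_GFP_ATOMIC,  /* models a request that must never sleep */
    TOY_GFP_KERNEL   /* models a request that may sleep to reclaim memory */
};

/* Models in_interrupt(): true while a toy interrupt handler is running. */
static bool toy_in_interrupt = false;

static bool toy_may_sleep(enum toy_gfp flags)
{
    return flags == TOY_GFP_KERNEL;
}

/*
 * Toy allocator. A sleeping request made from interrupt context is
 * treated as a caller bug and refused with NULL. The real allocator
 * does not necessarily fail this cleanly; the point here is only
 * to make the rule visible.
 */
static void *toy_alloc(size_t size, enum toy_gfp flags)
{
    if (toy_in_interrupt && toy_may_sleep(flags))
        return NULL;
    return malloc(size);
}

/*
 * Toy receive interrupt: marks interrupt context, asks for a buffer
 * with the flags the (hypothetical) driver chose, then restores the
 * previous context so nested calls behave consistently.
 */
static void *toy_rx_interrupt(size_t len, enum toy_gfp flags)
{
    bool saved = toy_in_interrupt;
    void *buf;

    toy_in_interrupt = true;
    buf = toy_alloc(len, flags);
    toy_in_interrupt = saved;
    return buf;
}
```

In this model toy_rx_interrupt(64, TOY_GFP_ATOMIC) gets a buffer, while
toy_rx_interrupt(64, TOY_GFP_KERNEL) is refused; the same
toy_alloc(64, TOY_GFP_KERNEL) call made outside the handler succeeds.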