From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S264481AbTLaNZ1 (ORCPT ); Wed, 31 Dec 2003 08:25:27 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S264484AbTLaNZ1 (ORCPT ); Wed, 31 Dec 2003 08:25:27 -0500 Received: from e3.ny.us.ibm.com ([32.97.182.103]:56218 "EHLO e3.ny.us.ibm.com") by vger.kernel.org with ESMTP id S264481AbTLaNZT (ORCPT ); Wed, 31 Dec 2003 08:25:19 -0500 Date: Wed, 31 Dec 2003 18:59:59 +0530 From: Srivatsa Vaddagiri To: linux-kernel@vger.kernel.org Cc: manfred@colorfullife.com, rusty@au1.ibm.com Subject: BUG in x86 do_page_fault? [was Re: in_atomic doesn't count local_irq_disable?] Message-ID: <20031231185959.A9041@in.ibm.com> Reply-To: vatsa@in.ibm.com References: <3FF044A2.3050503@colorfullife.com> <20031230185615.A9292@in.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20031230185615.A9292@in.ibm.com>; from vatsa@in.ibm.com on Tue, Dec 30, 2003 at 06:56:15PM +0530 Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Hi, in_atomic() doesn't seem to return true in code sections where IRQ's have been disabled (using local_irq_disable). As a result, I think do_page_fault() on x86 needs to be updated to note this fact: --- fault.c.org Wed Dec 31 18:34:18 2003 +++ fault.c Wed Dec 31 18:35:02 2003 @@ -259,7 +259,7 @@ * If we're in an interrupt, have no user context or are running in an * atomic region then we must not take the fault.. */ - if (in_atomic() || !mm) + if (in_atomic() || irqs_disabled() || !mm) goto bad_area_nosemaphore; down_read(&mm->mmap_sem); For ex: cosider some kernel code like this: local_irq_disable(); /* Do something which leads to a page fault */ local_irq_enable(); Such a code is buggy and should result in a oops(?) when the page fault happens. Without the above patch, the page fault handler will try acquiring the mmap_sem semaphore and can potentially sleep inside schedule (with IRQs disabled). Also isn't the same change required in scheduler() where it whines if called from a code section which is supposed to be atomic. --- sched.c.org Wed Dec 31 18:49:10 2003 +++ sched.c Wed Dec 31 18:49:43 2003 @@ -1507,7 +1507,7 @@ * Otherwise, whine if we are scheduling when we should not be. */ if (likely(!(current->state & (TASK_DEAD | TASK_ZOMBIE)))) { - if (unlikely(in_atomic())) { + if (unlikely(in_atomic() || irqs_disabled())) { printk(KERN_ERR "bad: scheduling while atomic!\n"); dump_stack(); } On Tue, Dec 30, 2003 at 06:56:15PM +0530, Srivatsa Vaddagiri wrote: > On Mon, Dec 29, 2003 at 04:13:38PM +0100, Manfred Spraul wrote: > > > > What do you mean with unable? Could you post what was printed? > > All I used to get was : > > "Debug: sleeping function called from invalid context > at include/linux/rwsem.h:45 > in_atomic: 0, irqs_disabled(): 1 > Call Trace:" > > That's it. Nothing more. Looks like it could not read the > stack at that point and hence couldn't dump the stack traceback. > > I now inserted some printk's in do_page_fault > to print regs->eip before calling down_read i.e: > > /* > * If we're in an interrupt, have no user context or are running in an > * atomic region then we must not take the fault.. > */ > if (in_atomic() || !mm) > goto bad_area_nosemaphore; > > + if (irqs_disabled()) { > + printk("BAD Access at (EIP) %08lx\n", regs->eip); > + printk("Bad Access at virtual address %08lx\n",address); > + } > > down_read(&mm->mmap_sem); > > > This is what I got now when I reran my stress test: > > > BAD Access at (EIP) c011c1b5 > Bad Access at virtual address 05050501 > Debug: sleeping function called from invalid context at include/linux/rwsem.h:47 > in_atomic():0, irqs_disabled():1 > Call Trace: > [] __might_sleep+0x86/0x90 > [] module_unload_free+0x36/0xe0 > [] do_page_fault+0xc9/0x573 > [] buffered_rmqueue+0x10f/0x120 > [] __alloc_pages+0xca/0x360 > [] do_anonymous_page+0x1c4/0x1d0 > [] do_page_fault+0x0/0x573 > [] module_unload_free+0x36/0xe0 > [] error_code+0x2d/0x38 > [<01010101>] > > BAD Access at (EIP) c0139934 > Bad Access at virtual address 0101011f > > > The first EIP (c011c1b5) is inside search_extable!! > The second EIP (c0139934) is inside get_ksymbol() ... > > I suspect the second happened when kdb tried decoding the (first) exception > address and hence is secondary here .. > > The stack trace that follows the first exception seems to be > totally bogus(?) .. I suspect the first exception > happened in search_extable when looking up some module exception > tables(?) ..Because search_module_extables() calls search_extable() with > interrupts disabled .. > > I think this points to some module unload (race) issues during hotplug .. > > Rusty, any comments? > > > > > > > > > > > I guess it's a get_user within either spin_lock_irq() or local_irq_disable. Without more info about the context, it's difficult to figure out if the page fault handler or the caller should be updated > > > Given the context above, I feel it would be correct for > do_page_fault() to avoid calling down_read() when IRQs are > disabled and instead just branch to bad_nosemaphore. > (as Rusty seems to concur) .. However, schedule() doesn't > seem to actually trap the case when it is called with interrupts disabled (as using local_irq_disable)? > > -- > > > Thanks and Regards, > Srivatsa Vaddagiri, > Linux Technology Center, > IBM Software Labs, > Bangalore, INDIA - 560033 > > > ------------------------------------------------------- > This SF.net email is sponsored by: IBM Linux Tutorials. > Become an expert in LINUX or just sharpen your skills. Sign up for IBM's > Free Linux Tutorials. Learn everything from the bash shell to sys admin. > Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click > _______________________________________________ > lhcs-devel mailing list > lhcs-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/lhcs-devel -- Thanks and Regards, Srivatsa Vaddagiri, Linux Technology Center, IBM Software Labs, Bangalore, INDIA - 560017