public inbox for linux-kernel@vger.kernel.org
* in_atomic doesn't count local_irq_disable?
@ 2003-12-29 13:33 Srivatsa Vaddagiri
  2003-12-29 13:35 ` Srivatsa Vaddagiri
  2003-12-30  2:37 ` Rusty Russell
  0 siblings, 2 replies; 15+ messages in thread
From: Srivatsa Vaddagiri @ 2003-12-29 13:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: lhcs-devel

Hi,
	I am getting messages like:

 "Debug: sleeping function called from invalid context at include/linux/rwsem.h:45"
 "in_atomic: 0, irqs_disabled(): 1"

while running some (CPU Hotplug) tests against (2.6.0-test11-bk6 + the CPU hotplug patch).

This happens because down_read() was called with interrupts disabled.
__might_sleep() was "unable" to dump the stack of callers that
led to this problem.

I put some debug code in down_read() (an inline function) and found
that down_read() was actually called from do_page_fault().

do_page_fault() avoids calling down_read() if we are in_atomic().
Isn't in_atomic() supposed to cover the IRQs-disabled case? If not,
shouldn't do_page_fault() also check irqs_disabled()
before calling down_read()?

Please let me know what I am missing here!


-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560033

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: in_atomic doesn't count local_irq_disable?
  2003-12-29 13:33 Srivatsa Vaddagiri
@ 2003-12-29 13:35 ` Srivatsa Vaddagiri
  2003-12-30  2:37 ` Rusty Russell
  1 sibling, 0 replies; 15+ messages in thread
From: Srivatsa Vaddagiri @ 2003-12-29 13:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: lhcs-devel

FYI, I am running with preemption disabled ..


-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560033


* Re: in_atomic doesn't count local_irq_disable?
@ 2003-12-29 15:13 Manfred Spraul
  2003-12-30 13:26 ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 15+ messages in thread
From: Manfred Spraul @ 2003-12-29 15:13 UTC (permalink / raw)
  To: Srivatsa Vaddagiri; +Cc: linux-kernel

Srivatsa Vaddagiri wrote:

>This happens because down_read() was called with interrupts disabled.
>__might_sleep() was "unable" to dump the stack of callers that
>led to this problem.

What do you mean by "unable"? Could you post what was printed?

I guess it's a get_user() within either spin_lock_irq() or
local_irq_disable(). Without more info about the context, it's difficult
to figure out whether the page fault handler or the caller should be updated.
--
	Manfred





* Re: in_atomic doesn't count local_irq_disable?
  2003-12-29 13:33 Srivatsa Vaddagiri
  2003-12-29 13:35 ` Srivatsa Vaddagiri
@ 2003-12-30  2:37 ` Rusty Russell
  1 sibling, 0 replies; 15+ messages in thread
From: Rusty Russell @ 2003-12-30  2:37 UTC (permalink / raw)
  To: vatsa; +Cc: linux-kernel, lhcs-devel

On Mon, 29 Dec 2003 19:03:36 +0530
Srivatsa Vaddagiri <vatsa@in.ibm.com> wrote:

> do_page_fault() avoids calling down_read() if we are in_atomic().
> Isn't in_atomic() supposed to cover the IRQs-disabled case? If not,
> shouldn't do_page_fault() also check irqs_disabled()
> before calling down_read()?

in_atomic() doesn't actually return true if irqs are disabled,
hence the "(in_atomic() || irqs_disabled())" test in __might_sleep().

do_page_fault() should have the same test...

Thanks,
Rusty.
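
To make the distinction concrete, here is a minimal user-space model (not
kernel code) of the two predicates. It assumes that in_atomic() is derived
from a per-CPU counter while the interrupt-enable state is a separate flag,
so disabling interrupts alone leaves the counter untouched. All sim_* names
are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the relevant per-CPU state. */
struct sim_cpu {
	unsigned int preempt_count;	/* raised by irq/softirq entry or preempt_disable() */
	bool irqs_enabled;		/* models the CPU interrupt-enable flag */
};

static void sim_local_irq_disable(struct sim_cpu *c) { c->irqs_enabled = false; }
static void sim_local_irq_enable(struct sim_cpu *c)  { c->irqs_enabled = true; }
static void sim_preempt_disable(struct sim_cpu *c)   { c->preempt_count++; }

static void sim_preempt_enable(struct sim_cpu *c)
{
	assert(c->preempt_count > 0);	/* unbalanced enable would be a bug */
	c->preempt_count--;
}

/* Looks only at the counter, so it cannot see a bare irq-disable. */
static bool sim_in_atomic(const struct sim_cpu *c)
{
	return c->preempt_count != 0;
}

static bool sim_irqs_disabled(const struct sim_cpu *c)
{
	return !c->irqs_enabled;
}

/* The combined test quoted above from __might_sleep. */
static bool sim_must_not_sleep(const struct sim_cpu *c)
{
	return sim_in_atomic(c) || sim_irqs_disabled(c);
}
```

After sim_local_irq_disable(), sim_in_atomic() is still false while
sim_irqs_disabled() is true, which matches the reported
"in_atomic: 0, irqs_disabled(): 1".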
-- 
   there are those who do and those who hang on and you don't see too
   many doers quoting their contemporaries.  -- Larry McVoy


* Re: in_atomic doesn't count local_irq_disable?
  2003-12-29 15:13 in_atomic doesn't count local_irq_disable? Manfred Spraul
@ 2003-12-30 13:26 ` Srivatsa Vaddagiri
  2003-12-31 13:29   ` BUG in x86 do_page_fault? [was Re: in_atomic doesn't count local_irq_disable?] Srivatsa Vaddagiri
  2003-12-31 13:35   ` [lhcs-devel] Re: in_atomic doesn't count local_irq_disable? Srivatsa Vaddagiri
  0 siblings, 2 replies; 15+ messages in thread
From: Srivatsa Vaddagiri @ 2003-12-30 13:26 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel, rusty, lhcs-devel

On Mon, Dec 29, 2003 at 04:13:38PM +0100, Manfred Spraul wrote:
> 
> What do you mean by "unable"? Could you post what was printed?

All I used to get was :

"Debug: sleeping function called from invalid context
at include/linux/rwsem.h:45
in_atomic: 0, irqs_disabled(): 1
Call Trace:"

That's it. Nothing more. It looks like it could not read the
stack at that point and hence couldn't dump the stack traceback.

I have now inserted some printk()s in do_page_fault()
to print regs->eip before calling down_read(), i.e.:

        /*
         * If we're in an interrupt, have no user context or are running in an
         * atomic region then we must not take the fault..
         */
        if (in_atomic() || !mm)
                goto bad_area_nosemaphore;

+       if (irqs_disabled()) {
+               printk("BAD Access at (EIP) %08lx\n", regs->eip);
+               printk("Bad Access at virtual address %08lx\n",address);
+       }

        down_read(&mm->mmap_sem);


This is what I got now when I reran my stress test:


BAD Access at (EIP) c011c1b5
Bad Access at virtual address 05050501
Debug: sleeping function called from invalid context at include/linux/rwsem.h:47
in_atomic():0, irqs_disabled():1
Call Trace:
 [<c011fd66>] __might_sleep+0x86/0x90
 [<c01378f6>] module_unload_free+0x36/0xe0
 [<c011b889>] do_page_fault+0xc9/0x573
 [<c013f1df>] buffered_rmqueue+0x10f/0x120
 [<c013f2ba>] __alloc_pages+0xca/0x360
 [<c0148d64>] do_anonymous_page+0x1c4/0x1d0
 [<c011b7c0>] do_page_fault+0x0/0x573
 [<c01378f6>] module_unload_free+0x36/0xe0
 [<c0109d6d>] error_code+0x2d/0x38
 [<01010101>] 

BAD Access at (EIP) c0139934
Bad Access at virtual address 0101011f


The first EIP (c011c1b5) is inside search_extable()!
The second EIP (c0139934) is inside get_ksymbol().

I suspect the second fault happened when kdb tried to decode the (first)
exception address, and hence is secondary here.

The stack trace that follows the first exception seems to be
totally bogus(?). I suspect the first exception
happened in search_extable() while looking up some module exception
table(?), because search_module_extables() calls search_extable() with
interrupts disabled.
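
For readers unfamiliar with exception tables, the following is a simplified
user-space sketch of what such a lookup does. It assumes a table of
(faulting instruction, fixup) pairs sorted by instruction address and
searched by binary search. The struct and function names are invented for
illustration and are not the kernel's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* One entry: if the instruction at `insn` faults, resume at `fixup`. */
struct sim_extable_entry {
	unsigned long insn;
	unsigned long fixup;
};

/*
 * Binary search over a table sorted by ascending `insn`.
 * Returns the matching entry, or NULL when the faulting address
 * has no registered fixup.
 */
static const struct sim_extable_entry *
sim_search_extable(const struct sim_extable_entry *table, size_t n,
		   unsigned long addr)
{
	size_t lo = 0, hi = n;		/* half-open interval [lo, hi) */

	assert(table != NULL || n == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].insn == addr)
			return &table[mid];
		if (table[mid].insn < addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}
```

Every probe dereferences table memory, so a table that is stale or freed
at the time of the lookup would make the search itself fault, which would
be consistent with the suspicion above.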

I think this points to some module unload (race) issues during hotplug.

Rusty, any comments?

> I guess it's a get_user() within either spin_lock_irq() or
> local_irq_disable(). Without more info about the context, it's difficult
> to figure out whether the page fault handler or the caller should be updated.

Given the context above, I feel it would be correct for
do_page_fault() to avoid calling down_read() when IRQs are
disabled and instead just branch to bad_area_nosemaphore
(as Rusty seems to concur). However, schedule() doesn't
seem to actually trap the case where it is called with interrupts
disabled (e.g. after local_irq_disable())?

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560033


* BUG in x86 do_page_fault?  [was Re: in_atomic doesn't count local_irq_disable?]
  2003-12-30 13:26 ` Srivatsa Vaddagiri
@ 2003-12-31 13:29   ` Srivatsa Vaddagiri
  2003-12-31 19:08     ` Linus Torvalds
  2003-12-31 13:35   ` [lhcs-devel] Re: in_atomic doesn't count local_irq_disable? Srivatsa Vaddagiri
  1 sibling, 1 reply; 15+ messages in thread
From: Srivatsa Vaddagiri @ 2003-12-31 13:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: manfred, rusty

Hi,
	in_atomic() doesn't seem to return true
in code sections where IRQ's have been disabled (using 
local_irq_disable).

As a result, I think do_page_fault() on x86 needs to 
be updated to note this fact:


--- fault.c.org Wed Dec 31 18:34:18 2003
+++ fault.c     Wed Dec 31 18:35:02 2003
@@ -259,7 +259,7 @@
         * If we're in an interrupt, have no user context or are running in an
         * atomic region then we must not take the fault..
         */
-       if (in_atomic() || !mm)
+       if (in_atomic() || irqs_disabled() || !mm)
                goto bad_area_nosemaphore;

        down_read(&mm->mmap_sem);


For example, consider some kernel code like this:


	local_irq_disable();

	/* Do something which leads to a page fault */

	local_irq_enable(); 

Such code is buggy and should result in an oops(?) when
the page fault happens.

Without the above patch, the page fault handler
will try to acquire the mmap_sem semaphore and can potentially
sleep inside schedule() (with IRQs disabled).


Also, isn't the same change required in schedule(),
where it whines if called from a code section that is supposed to be
atomic?

--- sched.c.org Wed Dec 31 18:49:10 2003
+++ sched.c     Wed Dec 31 18:49:43 2003
@@ -1507,7 +1507,7 @@
         * Otherwise, whine if we are scheduling when we should not be.
         */
        if (likely(!(current->state & (TASK_DEAD | TASK_ZOMBIE)))) {
-               if (unlikely(in_atomic())) {
+               if (unlikely(in_atomic() || irqs_disabled())) {
                        printk(KERN_ERR "bad: scheduling while atomic!\n");
                        dump_stack();
                }




-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017


* Re: [lhcs-devel] Re: in_atomic doesn't count local_irq_disable?
  2003-12-30 13:26 ` Srivatsa Vaddagiri
  2003-12-31 13:29   ` BUG in x86 do_page_fault? [was Re: in_atomic doesn't count local_irq_disable?] Srivatsa Vaddagiri
@ 2003-12-31 13:35   ` Srivatsa Vaddagiri
  2004-01-02  0:52     ` Manfred Spraul
  2004-01-02 14:00     ` Srivatsa Vaddagiri
  1 sibling, 2 replies; 15+ messages in thread
From: Srivatsa Vaddagiri @ 2003-12-31 13:35 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel, rusty, lhcs-devel

More debugging reveals that the page fault happens
always while doing a prefetch. The prefetch is
present inside list_for_each_entry macros.

For now I have disabled the x86 prefetch function
to do nothing.

The test seems to run fine so far w/o any of the 
page faults I was experiencing. Will update
at the end of the overnight run if I hit the problem again.

Wonder if prefetch has some issues on Intel x86 (P3) SMP systems?
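
As background for the prefetch observation, here is a user-space sketch of
a list iterator that issues a prefetch for the next node on every step. It
is modelled on the shape of list_for_each_entry but is not the kernel
source: the sim_* names are invented, and GCC/Clang's __builtin_prefetch
and __typeof__ stand in for the kernel's prefetch() and typeof.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Recover the containing structure from a pointer to its list member. */
#define sim_list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Walk the list, hinting the CPU to fetch the following node each step. */
#define sim_list_for_each_entry(pos, head, member)			\
	for (pos = sim_list_entry((head)->next, __typeof__(*pos), member), \
		     __builtin_prefetch(pos->member.next);		\
	     &pos->member != (head);					\
	     pos = sim_list_entry(pos->member.next, __typeof__(*pos), member), \
		     __builtin_prefetch(pos->member.next))

struct sim_item {
	int value;
	struct list_head node;
};

static void sim_list_add_tail(struct list_head *new_node, struct list_head *head)
{
	assert(new_node != NULL && head != NULL);
	new_node->prev = head->prev;
	new_node->next = head;
	head->prev->next = new_node;
	head->prev = new_node;
}

/* Sum all values; the prefetches are only hints and do not change the result. */
static int sim_sum_items(struct list_head *head)
{
	struct sim_item *it;
	int sum = 0;

	sim_list_for_each_entry(it, head, node)
		sum += it->value;
	return sum;
}
```

The address handed to the prefetch is whatever the current node's next
pointer holds, so a list that is being modified concurrently can feed it
arbitrary values even though the iteration logic itself is unchanged.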



-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017


* Re: BUG in x86 do_page_fault?  [was Re: in_atomic doesn't count local_irq_disable?]
  2003-12-31 13:29   ` BUG in x86 do_page_fault? [was Re: in_atomic doesn't count local_irq_disable?] Srivatsa Vaddagiri
@ 2003-12-31 19:08     ` Linus Torvalds
  2004-01-04 14:57       ` Pavel Machek
  2004-03-29 15:42       ` Pavel Machek
  0 siblings, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2003-12-31 19:08 UTC (permalink / raw)
  To: Srivatsa Vaddagiri; +Cc: Kernel Mailing List, manfred, rusty, Andrew Morton



On Wed, 31 Dec 2003, Srivatsa Vaddagiri wrote:
>
> 	in_atomic() doesn't seem to return true
> in code sections where IRQ's have been disabled (using 
> local_irq_disable).
> 
> As a result, I think do_page_fault() on x86 needs to 
> be updated to note this fact:

NO. 

Please don't do this, it will result in some _really_ nasty problems with 
X and other programs that potentially disable interrupts in user space.

Also, there are broken old drivers that potentially have interrupts 
disabled, and we shouldn't just oops them. We should have a warning, but 
we already do have that: that's what "might_sleep()" does.

So something like this may be appropriate at some point, but not in this 
format. At the very least you absolutely _have_ to check for user mode 
(possibly in the same place where we now have that

	/* It's safe to allow irq's after cr2 has been saved */

comment).

		Linus
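
The three policies under discussion can be compared with a small
user-space decision model. This is only a sketch: the sim_* names are
invented, and the "user aware" variant is one possible reading of the
"check for user mode" remark above, not a patch anyone in the thread
posted. It also does not address the concern about old drivers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sim_action {
	SIM_HANDLE_FAULT,	/* take mmap_sem and resolve the fault */
	SIM_BAD_AREA		/* goto bad_area_nosemaphore */
};

struct sim_fault {
	bool in_atomic;		/* what in_atomic() would return */
	bool irqs_disabled;	/* interrupts were off when the fault hit */
	bool user_mode;		/* fault was raised by user-space code */
	bool has_mm;		/* current task has a user address space */
};

/* The check as it stood before the proposed patch. */
static enum sim_action sim_policy_original(const struct sim_fault *f)
{
	assert(f != NULL);
	return (f->in_atomic || !f->has_mm) ? SIM_BAD_AREA : SIM_HANDLE_FAULT;
}

/* The proposed patch: refuse every fault taken with interrupts off. */
static enum sim_action sim_policy_patched(const struct sim_fault *f)
{
	assert(f != NULL);
	return (f->in_atomic || f->irqs_disabled || !f->has_mm)
		? SIM_BAD_AREA : SIM_HANDLE_FAULT;
}

/* Hypothetical variant: refuse only kernel-mode faults with interrupts off. */
static enum sim_action sim_policy_user_aware(const struct sim_fault *f)
{
	assert(f != NULL);
	if (f->in_atomic || !f->has_mm)
		return SIM_BAD_AREA;
	if (f->irqs_disabled && !f->user_mode)
		return SIM_BAD_AREA;
	return SIM_HANDLE_FAULT;
}
```

For a user-space program that faults with interrupts off (the X case), the
patched policy returns SIM_BAD_AREA while the other two still handle the
fault.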

> --- fault.c.org Wed Dec 31 18:34:18 2003
> +++ fault.c     Wed Dec 31 18:35:02 2003
> @@ -259,7 +259,7 @@
>          * If we're in an interrupt, have no user context or are running in an
>          * atomic region then we must not take the fault..
>          */
> -       if (in_atomic() || !mm)
> +       if (in_atomic() || irqs_disabled() || !mm)
>                 goto bad_area_nosemaphore;
> 
>         down_read(&mm->mmap_sem);


* Re: [lhcs-devel] Re: in_atomic doesn't count local_irq_disable?
  2003-12-31 13:35   ` [lhcs-devel] Re: in_atomic doesn't count local_irq_disable? Srivatsa Vaddagiri
@ 2004-01-02  0:52     ` Manfred Spraul
  2004-01-02 10:56       ` Srivatsa Vaddagiri
  2004-01-02 14:00     ` Srivatsa Vaddagiri
  1 sibling, 1 reply; 15+ messages in thread
From: Manfred Spraul @ 2004-01-02  0:52 UTC (permalink / raw)
  To: vatsa; +Cc: linux-kernel, rusty, lhcs-devel

Srivatsa Vaddagiri wrote:

>More debugging reveals that the page fault happens
>always while doing a prefetch. The prefetch is
>present inside list_for_each_entry macros.
>
>For now I have disabled the x86 prefetch function
>to do nothing.
>
>The test seems to run fine so far w/o any of the 
>page faults I was experiencing. Will update
>at the end of the overnight run if I hit the problem again.
>
>Wonder if prefetch has some issues on Intel x86 (P3) SMP systems?
>  
>
Hmm. Perhaps prefetch updates CR2?
We already know that CR2 is not directly linked to the page fault
interrupt: if a page fault happens at the same time as a higher
priority event (iirc a hw interrupt), then CR2 is updated and the higher
priority event is handled. That prevents Linux from using CR2 to store
the cpu number. Only NetWare can do that, because NetWare never causes
page faults.

Could you write a test module that reads cr2, executes a few prefetch 
instructions and then checks if cr2 changed? I won't have access to my 
P3 SMP system in the next few days.

--
    Manfred




* Re: [lhcs-devel] Re: in_atomic doesn't count local_irq_disable?
  2004-01-02  0:52     ` Manfred Spraul
@ 2004-01-02 10:56       ` Srivatsa Vaddagiri
  0 siblings, 0 replies; 15+ messages in thread
From: Srivatsa Vaddagiri @ 2004-01-02 10:56 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel, rusty, lhcs-devel

On Fri, Jan 02, 2004 at 01:52:07AM +0100, Manfred Spraul wrote:
> Could you write a test module that reads cr2, executes a few prefetch 
> instructions and then checks if cr2 changed? I won't have access to my 
> P3 SMP system in the next few days.
 
Hi Manfred,
	I wrote a test module and found that CR2 remains the same across the
prefetch. The module source I used is below. Note that I had
to use "my_prefetch" because the original prefetch (in asm/processor.h)
has been stubbed out in my tree to do nothing.



#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <asm/processor.h>	/* home of the original prefetch() */

/* Local copy of prefetch(): prefetchnta on SSE-capable CPUs, NOPs otherwise */
static inline void my_prefetch(const void *x)
{
        alternative_input(ASM_NOP4,
                          "prefetchnta (%1)",
                          X86_FEATURE_XMM,
                          "r" (x));
}

static int array[10];

static int __init dummy_init_module(void)
{
        unsigned long address;
        volatile int x;         /* volatile so the loads below are not optimized out */
        int i;

        /* read CR2; volatile so the compiler cannot merge the two reads */
        __asm__ __volatile__("movl %%cr2,%0" : "=r" (address));

        printk("CR2 before prefetch is %lx\n", address);

        for (i = 0; i < 10; ++i)
                my_prefetch(array + i);

        /* touch the prefetched data */
        for (i = 0; i < 10; ++i)
                x = *(array + i);

        /* read CR2 again */
        __asm__ __volatile__("movl %%cr2,%0" : "=r" (address));

        printk("CR2 after prefetch is %lx\n", address);

        return 0;
}


static void __exit dummy_cleanup_module(void)
{
}

module_init(dummy_init_module);
module_exit(dummy_cleanup_module);
MODULE_LICENSE("GPL");


The output of the above printk()s is:


CR2 before prefetch is 40017000
CR2 after prefetch is 40017000








-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017


* Re: [lhcs-devel] Re: in_atomic doesn't count local_irq_disable?
  2003-12-31 13:35   ` [lhcs-devel] Re: in_atomic doesn't count local_irq_disable? Srivatsa Vaddagiri
  2004-01-02  0:52     ` Manfred Spraul
@ 2004-01-02 14:00     ` Srivatsa Vaddagiri
  1 sibling, 0 replies; 15+ messages in thread
From: Srivatsa Vaddagiri @ 2004-01-02 14:00 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: linux-kernel, rusty, lhcs-devel, jun.nakajima, asit.k.mallick,
	sunil.saxena, torvalds


On Wed, Dec 31, 2003 at 07:05:53PM +0530, Srivatsa Vaddagiri wrote:
> More debugging reveals that the page fault happens
> always while doing a prefetch. The prefetch is
> present inside list_for_each_entry macros.
> 
> For now I have disabled the x86 prefetch function
> to do nothing.
> 
> The test seems to run fine so far w/o any of the 
> page faults I was experiencing. Will update
> at the end of the overnight run if I hit the problem again.
> 
> Wonder if prefetch has some issues on Intel x86 (P3) SMP systems?
> 

Even after disabling prefetch, I continue
to hit page faults.

With prefetch disabled, it _always_
traps while trying to dereference a NULL pointer in a
list head. If I break into the debugger, the
list head is actually valid (no NULL pointer is present), so
I don't understand why it read a NULL pointer value from
the list head.

With prefetch enabled, it traps
because of fetches from arbitrary (random) addresses.

So I am no longer sure whether this is a prefetch issue
and/or a hotplug issue. I'm continuing to investigate
and will post any new observations here.

FYI, the /proc/cpuinfo output on my SMP system is as below:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 10
model name      : Pentium III (Cascades)
stepping        : 1
cpu MHz         : 699.730
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1376.25

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 10
model name      : Pentium III (Cascades)
stepping        : 1
cpu MHz         : 699.726
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1396.73


processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 10
model name      : Pentium III (Cascades)
stepping        : 1
cpu MHz         : 699.726
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1396.73


processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 10
model name      : Pentium III (Cascades)
stepping        : 1
cpu MHz         : 699.726
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1396.73





-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017


* Re: BUG in x86 do_page_fault?  [was Re: in_atomic doesn't count local_irq_disable?]
  2003-12-31 19:08     ` Linus Torvalds
@ 2004-01-04 14:57       ` Pavel Machek
  2004-01-04 20:43         ` Linus Torvalds
  2004-03-29 15:43         ` Linus Torvalds
  2004-03-29 15:42       ` Pavel Machek
  1 sibling, 2 replies; 15+ messages in thread
From: Pavel Machek @ 2004-01-04 14:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Srivatsa Vaddagiri, Kernel Mailing List, manfred, rusty,
	Andrew Morton

Hi!

> > 	in_atomic() doesn't seem to return true
> > in code sections where IRQ's have been disabled (using 
> > local_irq_disable).
> > 
> > As a result, I think do_page_fault() on x86 needs to 
> > be updated to note this fact:
> 
> NO. 
> 
> Please don't do this, it will result in some _really_ nasty problems with 
> X and other programs that potentially disable interrupts in user
> space.

If a user program causes a page fault with interrupts disabled, it is
certainly buggy, right?

Either the user program does not really need IRQs disabled, or it does
need that but the page fault just broke its guarantees (=> severe problems
ahead).

In both cases there's a user program that needs fixing.
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


* Re: BUG in x86 do_page_fault?  [was Re: in_atomic doesn't count local_irq_disable?]
  2004-01-04 14:57       ` Pavel Machek
@ 2004-01-04 20:43         ` Linus Torvalds
  2004-03-29 15:43         ` Linus Torvalds
  1 sibling, 0 replies; 15+ messages in thread
From: Linus Torvalds @ 2004-01-04 20:43 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Srivatsa Vaddagiri, Kernel Mailing List, manfred, rusty,
	Andrew Morton



On Sun, 4 Jan 2004, Pavel Machek wrote:
> > 
> > Please don't do this, it will result in some _really_ nasty problems with 
> > X and other programs that potentially disable interrupts in user
> > space.
> 
> If a user program causes a page fault with interrupts disabled, it is
> certainly buggy, right?

No.

It may do a best-effort thing. It may also do a best-_performance_ thing, 
in leaving interrupts disabled over a piece of code that doesn't care, 
knowing that disabling interrupts is expensive.  Or it may just be a 
simple case of simplicity: disable interrupts over the whole region, 
knowing that only a part of it matters.

It by no means is automatically a bug. And it unquestionably _does_ 
happen. We used to warn about it. We stopped.

		Linus

