[patch 2.6.11] __copy_user breaks on unaligned src

public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed

* [patch 2.6.11] __copy_user breaks on unaligned src
@ 2005-03-18  7:04 Keith Owens
  2005-03-18  7:17 ` David Mosberger
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Keith Owens @ 2005-03-18  7:04 UTC (permalink / raw)
  To: linux-ia64

memcpy_mck.S::__copy_user breaks in the prefetch code under these
conditions :-

* src is unaligned and
* dst is near the end of a page and
* the page after dst is unmapped.

Signed-off-by: Keith Owens <kaos@sgi.com>

Index: linux/arch/ia64/lib/memcpy_mck.S
=================================--- linux.orig/arch/ia64/lib/memcpy_mck.S	2005-03-02 18:38:38.000000000 +1100
+++ linux/arch/ia64/lib/memcpy_mck.S	2005-03-18 17:49:44.000000000 +1100
@@ -300,7 +300,7 @@ .unaligned_src
 	add	src_pre_mem=0,src0	// prefetch src pointer
 	add	dst_pre_mem=0,dst0	// prefetch dest pointer
 	and	src0=-8,src0		// 1st src pointer
-(p7)	mov	ar.lc = r21
+(p7)	mov	ar.lc = cnt
 (p8)	mov	ar.lc = r0
 	;;
 	TEXT_ALIGN(32)

Test module follows.  vmalloc two pages, they should be contiguous.
Free the second page.  Using the first page, copy from an unaligned src
to the end of the page - 0x100, for a length of 0x100.  __copy_user
breaks at lfetch.fault.excl [dst_pre_mem], 128 in .unaligned_src.

The loop count in r21 is 1 value too high.  A length of 0x100 gives
ar.lc = r21 = 2.  .unaligned_src incorrectly copies r21 into ar.lc,
when it should copy cnt, so the lfetch lines are executed 3 times, not
2.  That takes dst_pre_mem past the end of the page and into an
unallocated area, oops.

#include <linux/config.h>
#include <linux/vmalloc.h>
#include <linux/module.h>
#include <asm/uaccess.h>

MODULE_LICENSE("GPL");

static int __init init_memcpy_test(void)
{
	char *p, *p1;
	printk("%s: start\n", __FUNCTION__);
	p = vmalloc(PAGE_SIZE);
	p1 = vmalloc(PAGE_SIZE);
	printk("%s: p %p p1 %p\n", __FUNCTION__, p, p1);
	vfree(p1);
	__copy_user(p+PAGE_SIZE-0x100, p+0x854, 0x100);
	vfree(p);
	printk("%s: end\n", __FUNCTION__);
	return 0;
}

static void __exit exit_memcpy_test(void) {}

module_init(init_memcpy_test)
module_exit(exit_memcpy_test)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch 2.6.11] __copy_user breaks on unaligned src
  2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
@ 2005-03-18  7:17 ` David Mosberger
  2005-03-18 17:40 ` Luck, Tony
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-18  7:17 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Fri, 18 Mar 2005 18:04:37 +1100, Keith Owens <kaos@sgi.com> said:

  Keith> memcpy_mck.S::__copy_user breaks in the prefetch code under these
  Keith> conditions :-

  Keith> * src is unaligned and
  Keith> * dst is near the end of a page and
  Keith> * the page after dst is unmapped.

  Keith> The loop count in r21 is 1 value too high.  A length of 0x100 gives
  Keith> ar.lc = r21 = 2.  .unaligned_src incorrectly copies r21 into ar.lc,
  Keith> when it should copy cnt, so the lfetch lines are executed 3 times, not
  Keith> 2.  That takes dst_pre_mem past the end of the page and into an
  Keith> unallocated area, oops.

That's a good thing to fix (it's definitely a performance bug).  However,
lfetch.fault should be safe to use even on unmapped memory.  See this
code in ia64_do_page_fault():

 /*
  * This fault was due to a speculative load or lfetch.fault, set the "ed"
  * bit in the psr to ensure forward progress.  (Target register will get a
  * NaT for ld.s, lfetch will be canceled.)
  */

I don't see off-hand why this wouldn't work as intended.

	--david

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [patch 2.6.11] __copy_user breaks on unaligned src
  2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
  2005-03-18  7:17 ` David Mosberger
@ 2005-03-18 17:40 ` Luck, Tony
  2005-03-18 19:19 ` David Mosberger
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Luck, Tony @ 2005-03-18 17:40 UTC (permalink / raw)
  To: linux-ia64

>That's a good thing to fix (it's definitely a performance bug).  However,
>lfetch.fault should be safe to use even on unmapped memory.  See this
>code in ia64_do_page_fault():
>
> /*
>  * This fault was due to a speculative load or lfetch.fault, set the "ed"
>  * bit in the psr to ensure forward progress.  (Target register will get a
>  * NaT for ld.s, lfetch will be canceled.)
>  */
>
>I don't see off-hand why this wouldn't work as intended.

That reminds me ... someone asked why we don't have the same test for
lfetch a few lines further down in ia64_do_page_fault() in the "no_context"
case (we only test for speculative access there).

The context for their question was using __copy_from_user_inatomic() after
calling inc_preempt_count().  On ia64 this just uses the standard code for
__copy_from_user().  If we happen to fault on the "lfetch", then we end up
in the "no_context" case because of the "if (in_atomic() || !mm)" test at the
top of ia64_do_page_fault.

-Tony

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [patch 2.6.11] __copy_user breaks on unaligned src
  2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
  2005-03-18  7:17 ` David Mosberger
  2005-03-18 17:40 ` Luck, Tony
@ 2005-03-18 19:19 ` David Mosberger
  2005-03-19  0:58 ` Luck, Tony
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-18 19:19 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Fri, 18 Mar 2005 09:40:02 -0800, "Luck, Tony" <tony.luck@intel.com> said:

  Tony> That reminds me ... someone asked why we don't have the same
  Tony> test for lfetch a few lines further down in
  Tony> ia64_do_page_fault() in the "no_context" case (we only test
  Tony> for speculative access there).

Clearly a bug.  Hopefully, that explains Keith's case as well?

	--david

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [patch 2.6.11] __copy_user breaks on unaligned src
  2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
                   ` (2 preceding siblings ...)
  2005-03-18 19:19 ` David Mosberger
@ 2005-03-19  0:58 ` Luck, Tony
  2005-03-22  3:04 ` Keith Owens
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Luck, Tony @ 2005-03-19  0:58 UTC (permalink / raw)
  To: linux-ia64

>  Tony> That reminds me ... someone asked why we don't have the same
>  Tony> test for lfetch a few lines further down in
>  Tony> ia64_do_page_fault() in the "no_context" case (we only test
>  Tony> for speculative access there).
>
>Clearly a bug.  Hopefully, that explains Keith's case as well?

Keith: I put the fix for ia64_do_page_fault() into my test-2.6.12 tree, so
you can try it out ... your memcpy fix is in there too.

-Tony

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch 2.6.11] __copy_user breaks on unaligned src
  2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
                   ` (3 preceding siblings ...)
  2005-03-19  0:58 ` Luck, Tony
@ 2005-03-22  3:04 ` Keith Owens
  2005-03-25  1:17 ` David Mosberger
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Keith Owens @ 2005-03-22  3:04 UTC (permalink / raw)
  To: linux-ia64

On Thu, 17 Mar 2005 23:17:29 -0800, 
David Mosberger <davidm@napali.hpl.hp.com> wrote:
>>>>>> On Fri, 18 Mar 2005 18:04:37 +1100, Keith Owens <kaos@sgi.com> said:
>
>  Keith> memcpy_mck.S::__copy_user breaks in the prefetch code under these
>  Keith> conditions :-
>
>  Keith> * src is unaligned and
>  Keith> * dst is near the end of a page and
>  Keith> * the page after dst is unmapped.
>
>  Keith> The loop count in r21 is 1 value too high.  A length of 0x100 gives
>  Keith> ar.lc = r21 = 2.  .unaligned_src incorrectly copies r21 into ar.lc,
>  Keith> when it should copy cnt, so the lfetch lines are executed 3 times, not
>  Keith> 2.  That takes dst_pre_mem past the end of the page and into an
>  Keith> unallocated area, oops.
>
>That's a good thing to fix (it's definitely a performance bug).  However,
>lfetch.fault should be safe to use even on unmapped memory.  See this
>code in ia64_do_page_fault():
>
> /*
>  * This fault was due to a speculative load or lfetch.fault, set the "ed"
>  * bit in the psr to ensure forward progress.  (Target register will get a
>  * NaT for ld.s, lfetch will be canceled.)
>  */
>
>I don't see off-hand why this wouldn't work as intended.

It's got me puzzled as well.  On my test system, single stepping the
offending instruction _WILL_ cause a fault, but letting it run normally
does not cause an error.  A normal run (without single step) definitely
uses lfetch with an invalid address, however ia64_fault() is not
invoked, not even for isr.na.

I am trying to get some time on the big system to reproduce the problem
and see why lfetch is faulting there.  Is there any chance that a
concurrent interrupt (the failing system does a lot of I/O) can lose
the lfetch status?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch 2.6.11] __copy_user breaks on unaligned src
  2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
                   ` (4 preceding siblings ...)
  2005-03-22  3:04 ` Keith Owens
@ 2005-03-25  1:17 ` David Mosberger
  2005-03-25  7:59 ` David Mosberger
  2005-03-25 20:27 ` David Mosberger
  7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-25  1:17 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Tue, 22 Mar 2005 14:04:55 +1100, Keith Owens <kaos@sgi.com> said:

  >> I don't see off-hand why this wouldn't work as intended.

  Keith> It's got me puzzled as well.  On my test system, single
  Keith> stepping the offending instruction _WILL_ cause a fault, but
  Keith> letting it run normally does not cause an error.  A normal
  Keith> run (without single step) definitely uses lfetch with an
  Keith> invalid address, however ia64_fault() is not invoked, not
  Keith> even for isr.na.

  Keith> I am trying to get some time on the big system to reproduce
  Keith> the problem and see why lfetch is faulting there.  Is there
  Keith> any chance that a concurrent interrupt (the failing system
  Keith> does a lot of I/O) can lose the lfetch status?

Hmmh, odd indeed.  I changed prefetch()/prefetchw() to use
lfetch.fault and now the kernel dies early on on an lfetch.fault that
goes to address 0 (triggered by find_pid()).  Since that's a NaT page,
you'd expect a general exception (NaT consumption).  However, the CPU
seems to get stuck in an infinite loop of general exceptions.  From
what I can tell, it get to "dispatch_to_fault_handler" and as soon as
it re-enables PSR.IC or perhaps PSR.I (not sure which), it gets
another general exception fault.

	--david

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch 2.6.11] __copy_user breaks on unaligned src
  2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
                   ` (5 preceding siblings ...)
  2005-03-25  1:17 ` David Mosberger
@ 2005-03-25  7:59 ` David Mosberger
  2005-03-25 20:27 ` David Mosberger
  7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-25  7:59 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Thu, 24 Mar 2005 17:17:49 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:

>>>>> On Tue, 22 Mar 2005 14:04:55 +1100, Keith Owens <kaos@sgi.com> said:
  >>> I don't see off-hand why this wouldn't work as intended.

  Keith> It's got me puzzled as well.  On my test system, single
  Keith> stepping the offending instruction _WILL_ cause a fault, but
  Keith> letting it run normally does not cause an error.  A normal
  Keith> run (without single step) definitely uses lfetch with an
  Keith> invalid address, however ia64_fault() is not invoked, not
  Keith> even for isr.na.

  Keith> I am trying to get some time on the big system to reproduce
  Keith> the problem and see why lfetch is faulting there.  Is there
  Keith> any chance that a concurrent interrupt (the failing system
  Keith> does a lot of I/O) can lose the lfetch status?

  David> Hmmh, odd indeed.  I changed prefetch()/prefetchw() to use
  David> lfetch.fault and now the kernel dies early on on an lfetch.fault that
  David> goes to address 0 (triggered by find_pid()).  Since that's a NaT page,
  David> you'd expect a general exception (NaT consumption).  However, the CPU
  David> seems to get stuck in an infinite loop of general exceptions.  From
  David> what I can tell, it get to "dispatch_to_fault_handler" and as soon as
  David> it re-enables PSR.IC or perhaps PSR.I (not sure which), it gets
  David> another general exception fault.

After some more digging, it appears that we do get a vhpt-miss fault
first and for some reason, that handler triggers a (nested) general
exception fault with ISR.code7:4}=3 (IA-64 Reserved Register/Field
fault, Unimplemented Data Address fault".  Not sure yet what triggers
the nested fault.  The odd thing is that the same kernel works fine
with the Ski simulator (where I do get the expected
ia64_do_page_fault() when find_pid prefetches address 0).

	--david

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch 2.6.11] __copy_user breaks on unaligned src
  2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
                   ` (6 preceding siblings ...)
  2005-03-25  7:59 ` David Mosberger
@ 2005-03-25 20:27 ` David Mosberger
  7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-25 20:27 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Thu, 24 Mar 2005 23:59:41 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:

  David> After some more digging, it appears that we do get a
  David> vhpt-miss fault first and for some reason, that handler
  David> triggers a (nested) general exception fault with
  David> ISR.code7:4}=3 (IA-64 Reserved Register/Field fault,
  David> Unimplemented Data Address fault".  Not sure yet what
  David> triggers the nested fault.

Well, this turned out to be a bit of a red herring: it was faulting
because the lfetch.fault happened before the Linux page-table-base
register (ar.k7) was initialized.  On the real hardware, ar.k7 was
zero and since the lfetch-triggered fault was to address 0, this
caused the vhpt_miss handler to go down in flames.

The attached patch fixes this problem and the machine now boots fine
using lfetch.fault for prefetch()/prefetchw().

Keith: unfortunately, I doubt this will be of any help in tracking
down your problem.

Tony: this patch is perfectly safe and helps make the kernel more
robust, so I'd recommend to apply it soonish.

Thanks,
	--david

ia64: Initialize ar.k7 to empty_zero_page early on

Without this initialization, early TLB misses to any user-regions will
cause the TLB miss handlers to go down in flames.  Normally, no such
early TLB misses occur, but aggressive use of lfetch.fault can trigger
it easily (e.g., when using lfetch.fault for the
prefetch()/prefetchw() macros we get an early miss for address 0 due
to a prefetch in find_pid()).

Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>

=== arch/ia64/kernel/setup.c 1.90 vs edited ==--- 1.90/arch/ia64/kernel/setup.c	2005-03-23 11:08:32 -08:00
+++ edited/arch/ia64/kernel/setup.c	2005-03-25 12:10:44 -08:00
@@ -711,6 +711,15 @@
 	ia64_set_kr(IA64_KR_FPU_OWNER, 0);

 	/*
+	 * Initialize the page-table base register to a global
+	 * directory with all zeroes.  This ensure that we can handle
+	 * TLB-misses to user address-space even before we created the
+	 * first user address-space.  This may happen, e.g., due to
+	 * aggressive use of lfetch.fault.
+	 */
+	ia64_set_kr(IA64_KR_PT_BASE, __pa(ia64_imva(empty_zero_page)));
+
+	/*
 	 * Initialize default control register to defer all speculative faults.  The
 	 * kernel MUST NOT depend on a particular setting of these bits (in other words,
 	 * the kernel must have recovery code for all speculative accesses).  Turn on

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2005-03-25 20:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-03-18  7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
2005-03-18  7:17 ` David Mosberger
2005-03-18 17:40 ` Luck, Tony
2005-03-18 19:19 ` David Mosberger
2005-03-19  0:58 ` Luck, Tony
2005-03-22  3:04 ` Keith Owens
2005-03-25  1:17 ` David Mosberger
2005-03-25  7:59 ` David Mosberger
2005-03-25 20:27 ` David Mosberger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox