* Re: [patch 2.6.11] __copy_user breaks on unaligned src
2005-03-18 7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
@ 2005-03-18 7:17 ` David Mosberger
2005-03-18 17:40 ` Luck, Tony
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-18 7:17 UTC (permalink / raw)
To: linux-ia64
>>>>> On Fri, 18 Mar 2005 18:04:37 +1100, Keith Owens <kaos@sgi.com> said:
Keith> memcpy_mck.S::__copy_user breaks in the prefetch code under these
Keith> conditions :-
Keith> * src is unaligned and
Keith> * dst is near the end of a page and
Keith> * the page after dst is unmapped.
Keith> The loop count in r21 is 1 value too high. A length of 0x100 gives
Keith> ar.lc = r21 = 2. .unaligned_src incorrectly copies r21 into ar.lc,
Keith> when it should copy cnt, so the lfetch lines are executed 3 times, not
Keith> 2. That takes dst_pre_mem past the end of the page and into an
Keith> unallocated area, oops.
That's a good thing to fix (it's definitely a performance bug). However,
lfetch.fault should be safe to use even on unmapped memory. See this
code in ia64_do_page_fault():
/*
* This fault was due to a speculative load or lfetch.fault, set the "ed"
* bit in the psr to ensure forward progress. (Target register will get a
* NaT for ld.s, lfetch will be canceled.)
*/
I don't see off-hand why this wouldn't work as intended.
--david
^ permalink raw reply [flat|nested] 9+ messages in thread* RE: [patch 2.6.11] __copy_user breaks on unaligned src
2005-03-18 7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
2005-03-18 7:17 ` David Mosberger
@ 2005-03-18 17:40 ` Luck, Tony
2005-03-18 19:19 ` David Mosberger
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Luck, Tony @ 2005-03-18 17:40 UTC (permalink / raw)
To: linux-ia64
>That's a good thing to fix (it's definitely a performance bug). However,
>lfetch.fault should be safe to use even on unmapped memory. See this
>code in ia64_do_page_fault():
>
> /*
> * This fault was due to a speculative load or lfetch.fault, set the "ed"
> * bit in the psr to ensure forward progress. (Target register will get a
> * NaT for ld.s, lfetch will be canceled.)
> */
>
>I don't see off-hand why this wouldn't work as intended.
That reminds me ... someone asked why we don't have the same test for
lfetch a few lines further down in ia64_do_page_fault() in the "no_context"
case (we only test for speculative access there).
The context for their question was using __copy_from_user_inatomic() after
calling inc_preempt_count(). On ia64 this just uses the standard code for
__copy_from_user(). If we happen to fault on the "lfetch", then we end up
in the "no_context" case because of the "if (in_atomic() || !mm)" test at the
top of ia64_do_page_fault.
-Tony
^ permalink raw reply [flat|nested] 9+ messages in thread* RE: [patch 2.6.11] __copy_user breaks on unaligned src
2005-03-18 7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
2005-03-18 7:17 ` David Mosberger
2005-03-18 17:40 ` Luck, Tony
@ 2005-03-18 19:19 ` David Mosberger
2005-03-19 0:58 ` Luck, Tony
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-18 19:19 UTC (permalink / raw)
To: linux-ia64
>>>>> On Fri, 18 Mar 2005 09:40:02 -0800, "Luck, Tony" <tony.luck@intel.com> said:
Tony> That reminds me ... someone asked why we don't have the same
Tony> test for lfetch a few lines further down in
Tony> ia64_do_page_fault() in the "no_context" case (we only test
Tony> for speculative access there).
Clearly a bug. Hopefully, that explains Keith's case as well?
--david
^ permalink raw reply [flat|nested] 9+ messages in thread* RE: [patch 2.6.11] __copy_user breaks on unaligned src
2005-03-18 7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
` (2 preceding siblings ...)
2005-03-18 19:19 ` David Mosberger
@ 2005-03-19 0:58 ` Luck, Tony
2005-03-22 3:04 ` Keith Owens
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Luck, Tony @ 2005-03-19 0:58 UTC (permalink / raw)
To: linux-ia64
> Tony> That reminds me ... someone asked why we don't have the same
> Tony> test for lfetch a few lines further down in
> Tony> ia64_do_page_fault() in the "no_context" case (we only test
> Tony> for speculative access there).
>
>Clearly a bug. Hopefully, that explains Keith's case as well?
Keith: I put the fix for ia64_do_page_fault() into my test-2.6.12 tree, so
you can try it out ... your memcpy fix is in there too.
-Tony
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [patch 2.6.11] __copy_user breaks on unaligned src
2005-03-18 7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
` (3 preceding siblings ...)
2005-03-19 0:58 ` Luck, Tony
@ 2005-03-22 3:04 ` Keith Owens
2005-03-25 1:17 ` David Mosberger
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Keith Owens @ 2005-03-22 3:04 UTC (permalink / raw)
To: linux-ia64
On Thu, 17 Mar 2005 23:17:29 -0800,
David Mosberger <davidm@napali.hpl.hp.com> wrote:
>>>>>> On Fri, 18 Mar 2005 18:04:37 +1100, Keith Owens <kaos@sgi.com> said:
>
> Keith> memcpy_mck.S::__copy_user breaks in the prefetch code under these
> Keith> conditions :-
>
> Keith> * src is unaligned and
> Keith> * dst is near the end of a page and
> Keith> * the page after dst is unmapped.
>
> Keith> The loop count in r21 is 1 value too high. A length of 0x100 gives
> Keith> ar.lc = r21 = 2. .unaligned_src incorrectly copies r21 into ar.lc,
> Keith> when it should copy cnt, so the lfetch lines are executed 3 times, not
> Keith> 2. That takes dst_pre_mem past the end of the page and into an
> Keith> unallocated area, oops.
>
>That's a good thing to fix (it's definitely a performance bug). However,
>lfetch.fault should be safe to use even on unmapped memory. See this
>code in ia64_do_page_fault():
>
> /*
> * This fault was due to a speculative load or lfetch.fault, set the "ed"
> * bit in the psr to ensure forward progress. (Target register will get a
> * NaT for ld.s, lfetch will be canceled.)
> */
>
>I don't see off-hand why this wouldn't work as intended.
It's got me puzzled as well. On my test system, single stepping the
offending instruction _WILL_ cause a fault, but letting it run normally
does not cause an error. A normal run (without single step) definitely
uses lfetch with an invalid address, however ia64_fault() is not
invoked, not even for isr.na.
I am trying to get some time on the big system to reproduce the problem
and see why lfetch is faulting there. Is there any chance that a
concurrent interrupt (the failing system does a lot of I/O) can lose
the lfetch status?
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [patch 2.6.11] __copy_user breaks on unaligned src
2005-03-18 7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
` (4 preceding siblings ...)
2005-03-22 3:04 ` Keith Owens
@ 2005-03-25 1:17 ` David Mosberger
2005-03-25 7:59 ` David Mosberger
2005-03-25 20:27 ` David Mosberger
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-25 1:17 UTC (permalink / raw)
To: linux-ia64
>>>>> On Tue, 22 Mar 2005 14:04:55 +1100, Keith Owens <kaos@sgi.com> said:
>> I don't see off-hand why this wouldn't work as intended.
Keith> It's got me puzzled as well. On my test system, single
Keith> stepping the offending instruction _WILL_ cause a fault, but
Keith> letting it run normally does not cause an error. A normal
Keith> run (without single step) definitely uses lfetch with an
Keith> invalid address, however ia64_fault() is not invoked, not
Keith> even for isr.na.
Keith> I am trying to get some time on the big system to reproduce
Keith> the problem and see why lfetch is faulting there. Is there
Keith> any chance that a concurrent interrupt (the failing system
Keith> does a lot of I/O) can lose the lfetch status?
Hmmh, odd indeed. I changed prefetch()/prefetchw() to use
lfetch.fault and now the kernel dies early on on an lfetch.fault that
goes to address 0 (triggered by find_pid()). Since that's a NaT page,
you'd expect a general exception (NaT consumption). However, the CPU
seems to get stuck in an infinite loop of general exceptions. From
what I can tell, it get to "dispatch_to_fault_handler" and as soon as
it re-enables PSR.IC or perhaps PSR.I (not sure which), it gets
another general exception fault.
--david
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [patch 2.6.11] __copy_user breaks on unaligned src
2005-03-18 7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
` (5 preceding siblings ...)
2005-03-25 1:17 ` David Mosberger
@ 2005-03-25 7:59 ` David Mosberger
2005-03-25 20:27 ` David Mosberger
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-25 7:59 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 24 Mar 2005 17:17:49 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:
>>>>> On Tue, 22 Mar 2005 14:04:55 +1100, Keith Owens <kaos@sgi.com> said:
>>> I don't see off-hand why this wouldn't work as intended.
Keith> It's got me puzzled as well. On my test system, single
Keith> stepping the offending instruction _WILL_ cause a fault, but
Keith> letting it run normally does not cause an error. A normal
Keith> run (without single step) definitely uses lfetch with an
Keith> invalid address, however ia64_fault() is not invoked, not
Keith> even for isr.na.
Keith> I am trying to get some time on the big system to reproduce
Keith> the problem and see why lfetch is faulting there. Is there
Keith> any chance that a concurrent interrupt (the failing system
Keith> does a lot of I/O) can lose the lfetch status?
David> Hmmh, odd indeed. I changed prefetch()/prefetchw() to use
David> lfetch.fault and now the kernel dies early on on an lfetch.fault that
David> goes to address 0 (triggered by find_pid()). Since that's a NaT page,
David> you'd expect a general exception (NaT consumption). However, the CPU
David> seems to get stuck in an infinite loop of general exceptions. From
David> what I can tell, it get to "dispatch_to_fault_handler" and as soon as
David> it re-enables PSR.IC or perhaps PSR.I (not sure which), it gets
David> another general exception fault.
After some more digging, it appears that we do get a vhpt-miss fault
first and for some reason, that handler triggers a (nested) general
exception fault with ISR.code7:4}=3 (IA-64 Reserved Register/Field
fault, Unimplemented Data Address fault". Not sure yet what triggers
the nested fault. The odd thing is that the same kernel works fine
with the Ski simulator (where I do get the expected
ia64_do_page_fault() when find_pid prefetches address 0).
--david
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [patch 2.6.11] __copy_user breaks on unaligned src
2005-03-18 7:04 [patch 2.6.11] __copy_user breaks on unaligned src Keith Owens
` (6 preceding siblings ...)
2005-03-25 7:59 ` David Mosberger
@ 2005-03-25 20:27 ` David Mosberger
7 siblings, 0 replies; 9+ messages in thread
From: David Mosberger @ 2005-03-25 20:27 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 24 Mar 2005 23:59:41 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:
David> After some more digging, it appears that we do get a
David> vhpt-miss fault first and for some reason, that handler
David> triggers a (nested) general exception fault with
David> ISR.code7:4}=3 (IA-64 Reserved Register/Field fault,
David> Unimplemented Data Address fault". Not sure yet what
David> triggers the nested fault.
Well, this turned out to be a bit of a red herring: it was faulting
because the lfetch.fault happened before the Linux page-table-base
register (ar.k7) was initialized. On the real hardware, ar.k7 was
zero and since the lfetch-triggered fault was to address 0, this
caused the vhpt_miss handler to go down in flames.
The attached patch fixes this problem and the machine now boots fine
using lfetch.fault for prefetch()/prefetchw().
Keith: unfortunately, I doubt this will be of any help in tracking
down your problem.
Tony: this patch is perfectly safe and helps make the kernel more
robust, so I'd recommend to apply it soonish.
Thanks,
--david
ia64: Initialize ar.k7 to empty_zero_page early on
Without this initialization, early TLB misses to any user-regions will
cause the TLB miss handlers to go down in flames. Normally, no such
early TLB misses occur, but aggressive use of lfetch.fault can trigger
it easily (e.g., when using lfetch.fault for the
prefetch()/prefetchw() macros we get an early miss for address 0 due
to a prefetch in find_pid()).
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
=== arch/ia64/kernel/setup.c 1.90 vs edited ==--- 1.90/arch/ia64/kernel/setup.c 2005-03-23 11:08:32 -08:00
+++ edited/arch/ia64/kernel/setup.c 2005-03-25 12:10:44 -08:00
@@ -711,6 +711,15 @@
ia64_set_kr(IA64_KR_FPU_OWNER, 0);
/*
+ * Initialize the page-table base register to a global
+ * directory with all zeroes. This ensure that we can handle
+ * TLB-misses to user address-space even before we created the
+ * first user address-space. This may happen, e.g., due to
+ * aggressive use of lfetch.fault.
+ */
+ ia64_set_kr(IA64_KR_PT_BASE, __pa(ia64_imva(empty_zero_page)));
+
+ /*
* Initialize default control register to defer all speculative faults. The
* kernel MUST NOT depend on a particular setting of these bits (in other words,
* the kernel must have recovery code for all speculative accesses). Turn on
^ permalink raw reply [flat|nested] 9+ messages in thread