* Re: Fixing up unaligned userspace access
2008-03-07 14:30 Fixing up unaligned userspace access Kieran Bingham
@ 2008-03-07 14:44 ` Adrian McMenamin
2008-03-12 18:41 ` Kieran Bingham
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Adrian McMenamin @ 2008-03-07 14:44 UTC
To: linux-sh
On Fri, March 7, 2008 2:30 pm, Kieran Bingham wrote:
> Hi guys,
>
> I'm getting confused about how the unaligned accesses are being handled
> in SH2/SH2a.
>
> Our busybox build and userland seems to be causing these exceptions far
> too often.
>
> For example, pressing "Tab" to tab complete a filename, causes these two
> exceptions to fire, and then soon after the whole board stalls.
>
>
>
> Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccda168
> ins=0x60f0
>
> Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccd9eac
> ins=0x9015
>
>
>
> Mainly I'm confused though, as looking at the assembly the instructions
> aren't unaligned ? and don't seem to be trying to do anything
> un-aligned ? - Is there something else that could be going on here?
>
My experience - admittedly with SH4 - is that these errors are almost
always caused by a memory leak, or access to uninitialised memory, elsewhere
in the kernel.
* Re: Fixing up unaligned userspace access
From: Kieran Bingham @ 2008-03-12 18:41 UTC
To: linux-sh
I think I found the fault! :)
On 07/03/2008, Adrian McMenamin <adrian@newgolddream.dyndns.info> wrote:
> On Fri, March 7, 2008 2:30 pm, Kieran Bingham wrote:
> >
> > Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccda168
> > ins=0x60f0
> >
> > Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccd9eac
> > ins=0x9015
> > Mainly I'm confused though, as looking at the assembly the instructions
> > aren't unaligned ? and don't seem to be trying to do anything
> > un-aligned ? - Is there something else that could be going on here?
> My experience - admittedly with SH4 - is that these errors are almost
> always caused by a memory leak, or access to uninitialized memory, elsewhere
> in the kernel.
The instructions don't look like they perform anything bad, because they don't!
They are the wrong instructions. It would seem that the address error
trap handler is adding 4 to the regs pointer before it calls
do_address_error, so regs->pc was actually returning the PR!
Does anyone know why this code adds 4?
Can we remove it if it's incorrect (patch below)?
I'm working on SH2a, so I don't know if it's an SH2-specific thing
that's been put in.
Remove erroneous offset on SH2a address error handling
Signed-off-by: Kieran Bingham <kbingham@mpc-data.co.uk>
---
diff --git a/arch/sh/kernel/cpu/sh2/entry.S b/arch/sh/kernel/cpu/sh2/entry.S
index 7a26569..0fc8906 100644
--- a/arch/sh/kernel/cpu/sh2/entry.S
+++ b/arch/sh/kernel/cpu/sh2/entry.S
@@ -267,7 +267,6 @@ ENTRY(sh_bios_handler)
 ENTRY(address_error_trap_handler)
 	mov	r15,r4		! regs
-	add	#4,r4
 	mov	#OFF_PC,r0
 	mov.l	@(r0,r15),r6	! pc
 	mov.l	1f,r0
* Re: Fixing up unaligned userspace access
From: Paul Mundt @ 2008-03-13 10:39 UTC
To: linux-sh
On Wed, Mar 12, 2008 at 06:41:57PM +0000, Kieran Bingham wrote:
> On 07/03/2008, Adrian McMenamin <adrian@newgolddream.dyndns.info> wrote:
> > On Fri, March 7, 2008 2:30 pm, Kieran Bingham wrote:
> > >
> > > Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccda168
> > > ins=0x60f0
> > >
> > > Fixing up unaligned userspace access in "sh" pid=48 pc=0x0ccd9eac
> > > ins=0x9015
> > > Mainly I'm confused though, as looking at the assembly the instructions
> > > aren't unaligned ? and don't seem to be trying to do anything
> > > un-aligned ? - Is there something else that could be going on here?
>
> > My experience - admittedly with SH4 - is that these errors are almost
> > always caused by a memory leak, or access to uninitialized memory, elsewhere
> > in the kernel.
>
> The instructions don't look like they perform anything bad, because they don't!
>
> They are the wrong instructions. It would seem that the address error
> trap handler is adding 4 to the regs pointer before it calls
> do_address_error, so regs->pc was actually returning the PR!
>
> Does anyone know why this code adds 4?
> Can we remove it if it's incorrect (patch below)?
>
> I'm working on SH2a, so I don't know if it's an SH2-specific thing
> that's been put in.
>
It's definitely not an SH-2 thing. I wonder if it's a leftover remnant
from when we were placing markers on the stack in the early days of the
SH-2 port. Anyways, dumping the stack from the address error path makes
it pretty obvious that the add is forcing all of the state to be off by
one register, which also explains why the regs->sr check was failing.
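
To illustrate the off-by-one-register effect, here is a minimal sketch in C. It is not kernel code: the struct only models the saved register frame (sixteen general registers followed by pc, pr, sr, gbr, mach, macl, tra, which is an assumption based on the usual SH struct pt_regs), and shifted_view() is a hypothetical helper that mimics reading the frame through a pointer advanced by 4 bytes, as the "add #4,r4" did.

```c
#include <stdint.h>
#include <string.h>

/* Assumed register frame layout, modelled on SH's struct pt_regs.
 * The exact field order is an assumption of this sketch. */
struct frame {
    uint32_t regs[16];
    uint32_t pc;
    uint32_t pr;
    uint32_t sr;
    uint32_t gbr;
    uint32_t mach;
    uint32_t macl;
    uint32_t tra;
};

/* Hypothetical helper: return the frame as it appears through a pointer
 * advanced by `skew` bytes (skew == 4 mimics "add #4,r4").
 * Bytes beyond the end of the real frame read as zero. */
static struct frame shifted_view(const struct frame *f, size_t skew)
{
    unsigned char raw[2 * sizeof(struct frame)] = {0};
    struct frame out;

    if (skew > sizeof(*f))
        skew = sizeof(*f);          /* stay inside the scratch buffer */

    memcpy(raw, f, sizeof(*f));     /* real frame at offset 0 */
    memcpy(&out, raw + skew, sizeof(out));
    return out;
}
```

With pc = 0x0cfa0112 and pr = 0x0cfa0242 in the real frame, shifted_view(&f, 4).pc holds the PR value, and the sr field holds whatever sits in the GBR slot, which is consistent with both the wrong instructions being reported and the regs->sr check failing.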
Given that, I'll add your patch to the 2.6.25 queue. Though it seems like
there are still a few corruption issues outstanding, which the slab
caches in particular seem to hit. There's occasional garbage in regs->pc,
which suggests that the exceptions are nesting and we're hitting the case
where the saved PC value is undefined. I've been debugging this most of
the day, and found a number of other bugs in the nommu code, but most of
the corruption issues are still outstanding (using current git both on
the SH7203 RSK and the SH7206 SolutionEngine).
* Re: Fixing up unaligned userspace access
From: Kieran Bingham @ 2008-03-14 14:10 UTC
To: linux-sh
Hmmm, a couple of issues come up from looking at the Address Error exception handling.
If I find the offending code statement in HEW, set breakpoints
before and after it, and use the continue button to iterate through the
instructions, the exception fires on the correct line, and all
of the regs match my expectations:
root:/home/nfs/apps> ./unaligned-access
Pid : 56, Comm: unaligned-acces
PC is at 0xcfa0112
PC : 0cfa0112 SP : 0cfbff34 SR : 00000001 Not tainted
R0 : 0cfa00f0 R1 : f2f9f1f9 R2 : 0cfbff88 R3 : 0cfbff21
R4 : 00000001 R5 : 0cfbff6c R6 : 0cfbff74 R7 : ffffffff
R8 : 0cfbff6c R9 : 0cfbff74 R10 : 00000001 R11 : 0c31ffc0
R12 : 0c3083dc R13 : 0c2e06a8 R14 : 0cfbff34
MACH: 00000000 MACL: 00000015 GBR : 00000000 PR : 0cfa0242
Fixing up unaligned userspace access in "unaligned-acces" pid=56
pc=0x0cfa0112 ins=0x6112
instruction : 6112
Fixing up a Mov.[bwl] Ins=6112, rm=cd35fa4, rn=cd35fa4, *rm=f2f9f1f9, count=4
calling copy_from_user(dst : cd35fa4, src : f2f9f1f9, count 4)
Killing process "unaligned-acces" due to unaligned access
SIGSEGV
Execution broke at 0x0cfa0112 as expected, and the handler can try to deal
with the correct instruction.
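
For reference, the reported ins= value can be checked by hand. The sketch below decodes only the SH "mov.{b,w,l} @Rm,Rn" load family (encoding 0110nnnnmmmm00ss) and tests an address for misalignment. The names sh_load, decode_sh_load and is_misaligned are hypothetical and are not taken from the kernel's fixup code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded form of an SH "mov.{b,w,l} @Rm,Rn" load: 0110nnnnmmmm00ss. */
struct sh_load {
    unsigned rn;    /* destination register number */
    unsigned rm;    /* register holding the source address */
    unsigned size;  /* access size in bytes: 1, 2 or 4 */
};

/* Returns true and fills *out if insn is mov.b/w/l @Rm,Rn, else false. */
static bool decode_sh_load(uint16_t insn, struct sh_load *out)
{
    unsigned sz = insn & 0x000f;    /* 0 = byte, 1 = word, 2 = long */

    if ((insn & 0xf000) != 0x6000 || sz > 2)
        return false;

    out->rn = (insn >> 8) & 0xf;
    out->rm = (insn >> 4) & 0xf;
    out->size = 1u << sz;
    return true;
}

/* An access is misaligned if the address is not a multiple of its size
 * (size must be a power of two). */
static bool is_misaligned(uint32_t addr, unsigned size)
{
    return (addr & (size - 1)) != 0;
}
```

Fed 0x6112, this gives rn = 1, rm = 1 and a 4-byte access, matching MOV.L @R1,R1 in the listing, and is_misaligned(0xf2f9f1f9, 4) is true for the R1 value in the register dump. Fed 0x0009 (NOP), the decoder rejects the instruction, which is what makes the second log suspicious.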
However, if I just let it run its course, with no interference from HEW:
Pid : 57, Comm: unaligned-acces
PC is at 0xcfa0116
PC : 0cfa0116 SP : 0cfbff34 SR : 00000001 Not tainted
R0 : 0cfa00f0 R1 : f2f9f1f9 R2 : 0cfbff88 R3 : 0cfbff21
R4 : 00000001 R5 : 0cfbff6c R6 : 0cfbff74 R7 : ffffffff
R8 : 0cfbff6c R9 : 0cfbff74 R10 : 00000001 R11 : 0c31ffc0
R12 : 0c3083dc R13 : 0c2e06f8 R14 : 0cfbff34
MACH: 00000000 MACL: 00000015 GBR : 00000000 PR : 0cfa0242
Fixing up unaligned userspace access in "unaligned-acces" pid=57
pc=0x0cfa0116 ins=0x0009
instruction : 9
Code from above:
0CFA010A  0009  NOP
0CFA010C  0009  NOP
0CFA010E  0009  NOP
0CFA0110  51E1  MOV.L @(H'04:4,R14),R1   # R1 is 0x4466 at this point
0CFA0112  6112  MOV.L @R1,R1
0CFA0114  1E13  MOV.L R1,@(H'0C:4,R14)
0CFA0116  0009  NOP
0CFA0118  0009  NOP
0CFA011A  0009  NOP
0CFA011A 0009 NOP
Is there some sort of timing issue here whereby, by the time the
exception is raised, the CPU has already started to execute the
following instructions? Something in a pipeline perhaps? Surely when
an address error occurs, that's it, and it should stop? But maybe I'm
missing something...
Thoughts / Comments anyone?
--
Cheers
Kieran
* Re: Fixing up unaligned userspace access
From: Paul Mundt @ 2008-03-21 8:49 UTC
To: linux-sh
On Fri, Mar 14, 2008 at 02:10:59PM +0000, Kieran Bingham wrote:
> Fixing up unaligned userspace access in "unaligned-acces" pid=57
> pc=0x0cfa0116 ins=0x0009
> instruction : 9
>
> Code from above:
>
> 0CFA010A  0009  NOP
> 0CFA010C  0009  NOP
> 0CFA010E  0009  NOP
> 0CFA0110  51E1  MOV.L @(H'04:4,R14),R1   # R1 is 0x4466 at this point
> 0CFA0112  6112  MOV.L @R1,R1
> 0CFA0114  1E13  MOV.L R1,@(H'0C:4,R14)
> 0CFA0116  0009  NOP
> 0CFA0118  0009  NOP
> 0CFA011A  0009  NOP
>
>
> Is there some sort of timing issue here whereby, by the time the
> exception is raised, the CPU has already started to execute the
> following instructions? Something in a pipeline perhaps? Surely when
> an address error occurs, that's it, and it should stop? But maybe I'm
> missing something...
>
It would be helpful to know what the stack layout is at the time you
enter the exception, both from HEW and from regular processing. The saved
PC value in this case is the next instruction to be executed, and even if
it's partially split out in the pipeline, the exec stage should not be
hit until execution resumes.
Looking at address_error_trap_handler(), I wonder if we have an
inconsistency between the stack-relative saved (ie, OFF_PC) PC and the
saved PC on the top of the stack pushed by the hardware on top of the
saved SR before the processing begins. You may wish to pop the saved PC
off and toss that in r7 or some such thing so you can more easily
compare. The only issue is that the hardware-saved PC at the top of the
stack can occasionally be undefined (ie, in the case of nested
exceptions), in which case we have to go to the OFF_PC value on-stack
regardless. This is fairly easy to test for however.